SUMMARY


The dissertation treats the dynamics of a decision maker's value of
information. There are two main parts: a section on the depreciation
(perishing) of information and a section on the appreciation (replenishment)
of information.

A notion, widely held by decision analysts but tenuously defined, is
that the value of any specific information diminishes over time. This
concept, termed information perishing, is rigorously defined and illustrated
by the use of a Markov model in the first section of the study.

The main assertions of this section are:

1) Information perishing is inevitable (not only for the Markov
model of information but for any state of information described
by a probability distribution).

2) For the Markov model the absolute value of the largest transient
eigenvalue is an upper bound for the rate of information perishing.

3) The rate of perishing is a decreasing function of time.

A short transition section alters the basic decision model to allow
an element of uncertainty for the exact timing of the decision. Basically
the new model of the decision process recognizes that many decisions in
real life are "triggered" by events which may be described by some
stochastic process. Without this uncertainty the decision maker could simply
discount the value of information because of perishing and would reduce
his problem to a static case. However, the uncertainty in timing forces
consideration of optimal policies of information replenishment, the second
main area of the thesis.

The major results of this section are:

1. Rules of optimality are developed for single and multiple occurring
decisions.

2. The optimality of periodic replenishment (under certain limiting
conditions) is established.

3. The suggestion that some of the research results of reliability
and maintainability theory may be applied to information replenishment
strategy.

The thesis closes with the customary delineation of areas of further
application and research.


CONTENTS

SUMMARY
FIGURES
TABLES
ACKNOWLEDGMENTS

CHAPTER

1 INTRODUCTION

2 INFORMATION DYNAMICS
   2.1 Purpose
   2.2 Introduction
   2.3 Example 1, A Two-State Markovian Case
   2.4 The Optimal Decision with Only Prior Knowledge
   2.5 Optimal Strategy with Perfect Information
   2.6 Basic Definitions
   2.7 Generalizations
   2.8 Summary

3 THE DECISION MODEL
   3.1 Purpose
   3.2 A Historical Example
   3.3 Comparison with the Extant Decision Model
   3.4 The Contingency Decision Model
   3.5 Other Examples
   3.6 Summary

4 THE CONTINGENCY DECISION MODEL AND INFORMATION DYNAMICS
   4.1 Purpose
   4.2 Introduction
   4.3 Example Two
   4.4 The "Outcome" Switch
   4.5 Comparative Results from Contingency Decision Making
   4.6 Two Peripheral Issues
   4.7 Summary

5 INFORMATION REPLENISHMENT
   5.1 Introduction
   5.2 Two Examples
   5.3 Rules of Optimality
   5.4 Other Markovian Distributions
   5.5 Repetitive Decision Situations
   5.6 The Case for Periodic Replenishment
   5.7 Summary

6 RELIABILITY AND MAINTAINABILITY THEORY
   6.1 Introduction
   6.2 Definitions
   6.3 Results
   6.4 Summary

7 CONCLUSIONS AND EXTENSIONS

APPENDIX A. Proof of Theorems 2.3 and 2.4
REFERENCES
DISTRIBUTION LIST
DD 1473


FIGURES

Figure
2-1   Markovian information model
2-2   Indexing schematic
2-3   Rewards for Example 1
2-4   V*(n), v*(n), and Δ(n) as a function of n
2-5   p(n) as a function of n
2-6   Reward structures
2-7   General reward structure
2-8   Comparison of Δ(n) and ~Δ(n)
2-9   Comparison of p(n) and ~p(n)
2-10  A transient state example
2-11  Δ(n) and p(n) for the transient state example
3-1   A decision-making model
3-2   U.S. strategy in Europe
3-3   Two decision-making models
3-4   Altered decision model
3-5   Contingency decision-making model
4-1   Model of production runs
4-2   Example 2 variables
4-3   Example 2 with costs
4-4   Probability mass function for X(m)
4-5   Representative probability mass functions, Example 2
4-6   Information acquisition
4-7   Discounted example
5-1   Decision/information acquisition model
5-2   Preferred alternatives for Example 4
5-3   Quality of information for two periodic acquisitions
5-4   Dominant alternatives, periodic acquisition, Example 4
5-5   Decision occurrence model
5-6   Net expected reward as a function of cost
5-7   A dominant policy
5-8   Increasing and decreasing occurrence rates
5-9   Rewards as a function of cost for increasing, constant, and
      decreasing occurrence rates
5-10  a = 1 and a = 2 regions, constant and increasing occurrence cases
5-11  The repetitive decision model
5-12  Equivalent information value model
5-13  The collapsed model
5-14  Net expected rewards
5-15  Relationship of [illegible], {Y_j}, and {Z_j}
6-1   Optimal region for x*


TABLES

Table
2-1   Decision-Outcome Results, Example 1
2-2   Rewards and Probabilities, Example 1
4-1   Reward Structure, Example 2
5-1   Summary of Expected Rewards
5-2   Periodic Replenishments, Example 4
5-3   Expected Rewards, Geometric Distribution
5-4   Optimal Policy for the Example Problem




ACKNOWLEDGMENTS 


The author attended Stanford University under a program fully
funded by the United States Army. I express my appreciation to the
University and the Army (in particular, Colonel C. H. Schilling,
Head of the Department of Engineering, U.S. Military Academy, and
Professor William K. Linvill, Chairman of the Engineering-Economic
Systems Department, Stanford University) for making possible one of
life's noble experiences.

I was particularly fortunate, while at Stanford, to be associated
with the Decision Analysis Group of the Stanford Research Institute
(SRI). Special tribute is due Dr. James E. Matheson for a peculiar
courage in accepting a true "green horn" to be part of his professional
and accomplished group. Working with Dr. D. Warner North, Dr. Allen C.
Miller, and Dr. Bruce R. Judd of SRI was particularly satisfying and
motivating.

Finally, I must single out two people whose forbearance and
support have been central to my life in the last three years. My wife
has been a pillar of strength and encouragement during many hectic months
of thesis preparation. She is the epitome of Euripides' thought that
"man's best possession is a sympathetic wife." My advisor, Professor
Ronald A. Howard, deserves and receives the customary credit for
interesting me in decision analysis, stimulating my curiosity, and
encouraging my academic endeavors. However, far beyond these advisor
accolades, I thank him for being a friend when a friend was badly
needed.

In  addition,  the  Department  of  Defense  Advanced  Research  Projects 
Agency  (ARPA)  and  the  National  Science  Foundation  provided  financial 
support  for  the  supervision  and  publication  of  this  work. 




CHAPTER  1 
INTRODUCTION 




Man has a propensity to acquire and store items he will need to
satisfy future needs. History depicts prehistoric man carefully
collecting and hoarding food, stone tools, and animal skins to carry him
through an arduous winter. Modern man has perpetuated this characteristic.
However, in an age when physical wants are more easily satisfied
the emphasis has shifted from the acquisition of material objects.
Instead, on an increasing scale, people, organizations and nations are
collecting information as a hedge against tomorrow's demands. As Shubik
[21]* notes

    There is an old saying in bridge that a peek is worth
    two finesses. In many instances the major weapon of
    competition may be special knowledge or information.

McDonough [3] highlights the trend by reporting that over [illegible] of the
total U.S. labor force is engaged in clerical activities; over
10,000,000 people are directly concerned with the production and
processing of information; and at least 50% of the cost of running the
economy is information costs.

The very emphasis on information has led to inevitable problems:
"in every ... sphere of modern life, the chronic condition is a surfeit
of information, useless, poorly integrated, or lost somewhere in
the system" [7]. Wilensky continues with a list of desiderata for information:
clear, timely, reliable, valid, adequate, and wide-ranging, the obvious
connotation being that these are more noticeable by their absence than by
their presence.

These problems arise in part because organizations have not adopted
means to rationalize the information process. Decision analysis, among
the many quantitative models of decision making, most explicitly treats
the value of information and provides a consistent basis for consideration
of the acquisition and use of information. Expository works by Howard
[14,15], North [18], and Raiffa [4], as well as a recent dissertation by
Miller [17], are significant buttresses for a methodology of information
resource allocation.

*Numbers in square brackets refer to the Bibliography found in rear of
the thesis.

However, even these valuable contributions are silent on the dynamics
of information. Implicit in many of the qualitative analyses of
information acquisition (Wohlstetter [8], Wilensky [7]) and explicit in
criticism of national intelligence activities (e.g., post hoc analysis
of the Berlin Wall, Tet, and the Yom Kippur war) is a recognition of an
information value-time relationship. However, most quantitative analysis
of information treats the value of information as static, invariant
over time.* This dissertation, building on the seminal foundation of
the previously cited works, analyzes the dynamics of information.

Chapter  2 is  the  framework  for  the  entire  thesis.  We  perhaps  all 
share  an  intuitive  feeling  that  the  value  of  information  decreases  with 
the  passing  of  time.  However,  exactly  what  do  we  mean  by  information 
"perishing"?  Is  this  an  inevitable  phenomenon?  How  do  we  measure  the 
rate  at  which  perishing  occurs?  Is  the  rate  invariant?  What  is  the 
effect,  if  any,  of  risk  aversion  on  this  "perishing"? 

Chapter 2 treats the depreciation of the value of information over
time. The phenomenon is indeed inevitable, and for states of information
that can be modeled by a Markov process we have a handy benchmark
for the rate of perishing. This yardstick, for the two-state case, is
related to the "shrinkage" as defined by Howard [2]. An important result
is that the value of information "perishes" at a rate equal to or
less than the absolute value of the largest "transient" eigenvalue of
the underlying Markov process.

The results of Chapter 2 have merit in their own right. However,
an astute analyst, if he knew, for example, the exact timing of a decision,
could allow the necessary time for information collection, calculate
the depreciation of the value, and reduce the problem to essentially
a static situation. This, of course, assumes he knows the exact timing
of the implementation of the decision. As illustrated in Chapter 3, many
decisions of importance and interest are implemented at an uncertain
time in the future. We slightly alter Howard's decision model [15] to
introduce an element of uncertainty in the time of occurrence of the
decision. Incorporation of this probability into the basic decision
model leads to fruitful study.

*Ransom [5] reports that strategic intelligence in wartime depreciates
at the rate of 10% per month. This is, at best, an empirical observation
which lacks a rigorous definition and quantification of value.

In particular, Chapter 4 reconsiders the rate of information perishing
in light of this uncertainty. We also treat intermediate information
acquisition and discounting of rewards as extensions of the basic results
of Chapter 2.

In a sense Chapters 3 and 4 serve as a transition from Chapter 2 to
Chapter 5, a consideration of the appreciation or replenishment of
information. We illustrate the meaning of an optimal policy of information
acquisition and determine rules of optimality for single and multiple
occurring decisions. In particular, a decision occurrence described by a
geometric probability distribution serves as a metric for other
distributions.

Chapter 6 builds on the results of Chapter 5 and extends the techniques
of information appreciation by utilizing results from the established
theory of maintainability and reliability. Several of these
well-established results lead to extensions of the original conclusions of
Chapter 5.

The  final  chapter  summarizes  the  study  and  suggests  areas  for 
further  development  and  research. 

As noted previously this thesis fills a niche in a growing body of
work on information value theory. The intelligence agencies of this
country as well as analysts of many business firms are faced with a
formidable resource allocation problem. There usually exists a multiple
array of collection devices, each with its own probability of acquiring
various pieces of data. These data in turn result in different updates
of prior information that influence one or more of a compendium of
decisions. These decisions, likewise, have different associated costs and
benefits as well as probabilities of occurrence.

One would be both naive and foolhardy to claim at this stage of
development a complete theory of information resource allocation that
would aid these decision makers. However, the results of this thesis are
a solid groundwork for the much needed follow-on research. The definition
and concept of information perishing and the revision of the decision
model lead to results that were previously tenuously shared and
accepted by many decision analysts but never precisely defined. The
theory of appreciation and the optimal policies of information acquisition
are new to information value theory and presage even fuller exploitation
of reliability theory. While much of the reliability work has to
do with statistical inference and parameter estimation there is also a
large body of conclusions concerning maintainability and optimal
replacement policies. These results have yet to be fully mined for their
application to information perishing and replenishment.

The  ultimate  goal,  of  course,  is  a set  of  allocation  rules  for  the 
intelligence  or  information  decision  maker.  This  thesis  forms  a secure 
stepping  stone  for  reaching  that  goal. 
CHAPTER 2
INFORMATION DYNAMICS

2.1  Purpose 

This chapter examines the time variation of the value of information.
In particular, we define two key concepts, information perishing and the
rate of information perishing. We then proceed to develop several
properties of these two essential parameters.

2.2  Introduction 

Most of the expository discussions of decision analysis treat the
value of information as a static quantity [14,15,19]. Howard's well-known
bid problem [14], as an example, computes the expected increased
profit to the bidder, given clairvoyance or perfect information about
his own cost, to be 1/96 units. However, one may consider two extremes.
If the clairvoyant delivers the perfect information too late for the
bidder to incorporate the data into his bid, then the expected increase
in profit is surely not 1/96. Conversely, one may also argue that if the
bidder receives the information much earlier than the date of the bid,
he may feel that changing environmental factors would affect the validity
of the information. Therefore, the expected increased profit of 1/96 is
in a sense a conditional value, one that is correct if the information
is "timely" and "fresh."

We may illustrate the dynamics of the value of the state of information
with an example.

2.3 Example 1, A Two-State Markovian Case

We choose the simplest of examples where the decision maker can
choose either state "1" or state "2." When the true state of nature is
subsequently revealed, he receives a greater reward if he has correctly
chosen the state and a lesser reward (perhaps a cost) for an incorrect
choice. His state of information is described by a Markov process.

Although not critical to the discussion we could suggest that the
situation represents such real-life decisions as stockage of item 1 or
item 2 where financial or storage constraints limit the seller's choice
to one or the other item; defense of Area 1 or Area 2 against repetitive
enemy attacks where the small size of the defending force or a lack of
transportation precludes defense of both areas; or even the "pea in a
shell" game at the local carnival.

We precisely define the situation as:

1. The decision maker can choose state 1 or state 2 but not both.

2. A Markovian model, Fig. 2-1, represents the model of his information
on state occupancy.

3. The decision maker can change his decision prior to each transition.
However, he does not observe the process at any time.
In other words, he makes a series of decisions, e.g., 1, 1, 2,
1, ..., 2, etc., and at the end of the game is given some reward
contingent on the number of correct decisions.

4. The decision-outcome matrix is shown in Table 2-1.


TABLE 2-1

Decision-Outcome Results, Example 1

                          Decision
Outcome       Choose State 1    Choose State 2
State 1           +100              -100
State 2           -100              +100

2.4 The Optimal Decision with Only Prior Knowledge

Let

$$\delta(m) = \delta^{(1)}$$

be the decision to choose state 1 at transition $m$, and

$$\delta(m) = \delta^{(2)}$$

be the decision to choose state 2 at transition $m$. Assume the game or
decision process lasts for $M$ transitions. The decision maker must
a priori make a series of $M$ decisions $\{\delta(m)\} = \{\delta(0), \ldots, \delta(M)\}$
such as

$$\{\delta(0) = \delta^{(1)}, \ldots, \delta(M) = \delta^{(2)}\}$$

Figure 2-1 Markovian information model

    P{state 1 at n | state 1 at n-1, ε} = 0.1
    P{state 2 at n | state 1 at n-1, ε} = 0.9
    P{state 2 at n | state 2 at n-1, ε} = 0.2
    P{state 1 at n | state 2 at n-1, ε} = 0.8
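His prior knowledge is contained solely in the Markov model of Fig. 2-1.
Therefore, he would rationally calculate

$$\Pr\{s(0)=1 \mid \varepsilon\} = \pi_1 = 8/17$$

and

$$\Pr\{s(0)=2 \mid \varepsilon\} = \pi_2 = 9/17$$

From Table 2-1 we may calculate the expected reward at transition $m = 0$,
conditioned on the choice of $\delta^{(1)}$, as an example, as

$$\langle v(0) \mid \delta(0)=\delta^{(1)}, \varepsilon \rangle = \sum_i \pi_i(0)\, r_{i1} = -5.88 \tag{2.1}$$

where $r_{ik}$ is the reward of Table 2-1 for outcome state $i$ under decision $\delta^{(k)}$.

In general, the expected reward at any transition is

$$\langle v(m) \mid \delta(m)=\delta^{(k)}, \varepsilon \rangle = \sum_i \pi_i(m)\, r_{ik}, \quad k = 1,2 \tag{2.2}$$

However, in the example, with only prior knowledge

$$\pi_i(m) = \pi_i(0) = \pi_i$$

and

$$\langle v(m) \mid \delta(m)=\delta^{(k)}, \varepsilon \rangle = \sum_i \pi_i\, r_{ik}, \quad k = 1,2 \tag{2.3}$$

The optimal decision is defined by

$$\delta^*(m) = \arg\max_{\delta^{(k)}} \langle v(m) \mid \delta(m)=\delta^{(k)}, \varepsilon \rangle \tag{2.4}$$

or

$$\delta^*(m) = \delta^{(2)}, \quad m = 0,1,2,\ldots,M$$

The optimal decision, in effect, is no more than the optimal choice
of a column from the reward matrix of Table 2-1. Corresponding to this
optimal decision is a reward

$$\langle v(m) \mid \delta^*(m), \varepsilon \rangle = \langle v(m) \mid \delta^{(2)}, \varepsilon \rangle = 5.88$$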

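Some compactness in notation is achieved by defining

$$v^{(k)}(m) = \langle v(m) \mid \delta(m)=\delta^{(k)}, \varepsilon \rangle \tag{2.5}$$

and

$$v^*(m) = \langle v(m) \mid \delta(m)=\delta^*, \varepsilon \rangle \tag{2.6}$$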

The decision maker's expected future reward is also of interest.
We will use "n" to index periods remaining and define the expected
future rewards with $n$ time periods remaining as

$$\langle v(n) \mid \{\delta^*(n)\}, \varepsilon \rangle \tag{2.7}$$

where $\{\delta^*(n)\}$ implies

$$\{\delta^*(n) = \delta^*,\ \delta^*(n-1) = \delta^*,\ \ldots,\ \delta^*(1) = \delta^*\} \tag{2.8}$$

In the example the optimal decision, as noted, is $\delta^{(2)}$ for
every transition. Therefore, the expected future reward has a particularly
simple form

$$\langle v(n) \mid \{\delta^*(n)\}, \varepsilon \rangle = \langle v(n) \mid \{\delta^*(n)\}=\delta^{(2)}, \varepsilon \rangle = n(5.88), \quad n = M, M-1, M-2, \ldots, 2, 1, 0 \tag{2.9}$$

This "ramp" is plotted in Fig. 2-3a for $M = 10$.

Again compactness is realized by defining

$$v^*(n) = \langle v(n) \mid \{\delta^*(n)\}, \varepsilon \rangle \tag{2.10}$$

for the expected future reward conditioned on the decision maker electing
the optimal decision at each transition.

(The indices "n" for periods to go and "m" for periods past imply
that $n + m = M$ for a process with horizon $M$. See Fig. 2-2.)

We may summarize the example. Based on a prior state of knowledge
contained in the Markov model of Fig. 2-1 the decision maker should
choose state 2 for the entire sequence. His expected reward per transition
is +5.88.
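The arithmetic of this summary can be checked with a short sketch. It is not part of the original analysis: it assumes the transition probabilities of Fig. 2-1 are 0.1 and 0.9 out of state 1 and 0.8 and 0.2 out of state 2 (the values consistent with the limiting probabilities 8/17 and 9/17), takes the rewards from Table 2-1, and uses illustrative function names.

```python
from fractions import Fraction

# Transition matrix assumed from Fig. 2-1 (rows: current state, columns: next state).
P = [[Fraction(1, 10), Fraction(9, 10)],
     [Fraction(8, 10), Fraction(2, 10)]]

# Reward matrix of Table 2-1: R[i][k] is the reward when the true state is i+1
# and the decision is "choose state k+1".
R = [[100, -100],
     [-100, 100]]

def stationary_two_state(P):
    """Limiting state probabilities of a two-state Markov chain."""
    a, b = P[0][1], P[1][0]  # probabilities of moving 1 -> 2 and 2 -> 1
    return [b / (a + b), a / (a + b)]

def expected_rewards(pi, R):
    """Expected reward of each decision k given state probabilities pi."""
    return [sum(pi[i] * R[i][k] for i in range(2)) for k in range(2)]

pi = stationary_two_state(P)
rewards = expected_rewards(pi, R)
best = max(range(2), key=lambda k: rewards[k])  # index of the best decision

print("limiting probabilities:", pi)
print("expected rewards:", [round(float(r), 2) for r in rewards])
print("optimal decision: choose state", best + 1)
```

Exact fractions are used so that the limiting probabilities appear as 8/17 and 9/17 rather than as rounded decimals.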
2.5 Optimal Strategy with Perfect Information

What changes would the decision maker effect if he were to receive
perfect information on the initial state (all other assumptions of the
example remaining the same)?

We may define the expected reward at transition $m$ given the
starting state $i$ as

$$\langle v(m) \mid \delta(m)=\delta^*, s(0)=i, \varepsilon \rangle, \quad i = 1,2 \tag{2.11}$$

or compactly as

$$v^*_{i(0)}(m) = \langle v(m) \mid \delta(m)=\delta^*, s(0)=i, \varepsilon \rangle \tag{2.12}$$

The equivalent relationships for expected future rewards are

$$\langle v(n) \mid \{\delta(n)\}=\delta^*, s(0)=i, \varepsilon \rangle, \quad i = 1,2 \tag{2.13}$$

and

$$v^*_{i(0)}(n) = \langle v(n) \mid \{\delta(n)\}=\delta^*, s(0)=i, \varepsilon \rangle \tag{2.14}$$

Finally, for perfect information at time $m = 0$, [PI(0)], the expected
reward at any transition is

$$\langle v(m) \mid \delta(m)=\delta^*, PI(0), \varepsilon \rangle = \pi_1 \langle v(m) \mid \delta(m)=\delta^*, s(0)=1, \varepsilon \rangle + \pi_2 \langle v(m) \mid \delta(m)=\delta^*, s(0)=2, \varepsilon \rangle \tag{2.15}$$

We economize further on notation by writing

$$v^*_{PI(0)}(m) = \sum_i \pi_i\, v^*_{i(0)}(m), \quad i = 1,2 \tag{2.16}$$

Analogously we have for expected future rewards

$$V^*(n) = \sum_i \pi_i\, v^*_{i(0)}(n) \tag{2.17}$$

We may use these results and the usual Markov matrix mechanics
(Howard [2]) to calculate the value of perfect information as shown in
Table 2-2. The values in columns (5) and (9) are plotted as Fig. 2-3b.
Figure 2-4a shows the expected future reward conditioned on receipt
of perfect information at $m = 0$ and the expected future rewards
conditioned on only the prior state of knowledge. Figure 2-4b indicates the
difference between these two quantities

$$\Delta(n) = V^*(n) - v^*(n) \tag{2.18}$$

Examination of Fig. 2-4b reveals that although the perfect information
acquired at $m = 0$ initially places the decision maker in a relatively
favorable position, this advantage diminishes over time and by the eighth
transition the advantage has disappeared. This decrement, which we shall
shortly define as information perishing, has a natural interpretation in
terms of response time. If the decision maker requires one period to
adjust his strategy to the receipt of perfect information at transition
zero, the value of this clairvoyance is 171.98; if he requires over eight
periods to react, then the information has no value.
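The decline described here can be reproduced numerically. The sketch below is an illustration only, under stated assumptions: the transition probabilities are taken as 0.1, 0.9, 0.8, and 0.2, the rewards come from Table 2-1, the horizon is M = 10, and all function names are hypothetical. For each transition it compares the expected reward when the starting state is revealed at m = 0 with the expected reward under prior knowledge alone.

```python
P = [[0.1, 0.9],
     [0.8, 0.2]]            # transition probabilities assumed from Fig. 2-1
R = [[100.0, -100.0],
     [-100.0, 100.0]]       # rewards of Table 2-1, R[state][decision]
PRIOR = [8 / 17, 9 / 17]    # limiting state probabilities
M = 10                      # horizon of the plotted example

def step(pi, P):
    """Advance a state-probability vector by one transition."""
    return [sum(pi[i] * P[i][j] for i in range(2)) for j in range(2)]

def best_reward(pi, R):
    """Expected reward of the best decision for state probabilities pi."""
    return max(sum(pi[i] * R[i][k] for i in range(2)) for k in range(2))

def advantage_per_transition(M):
    """Gain of perfect information at m = 0 over prior knowledge, per transition."""
    prior_reward = best_reward(PRIOR, R)
    # One probability vector for each starting state the clairvoyant may reveal.
    beliefs = [[1.0, 0.0], [0.0, 1.0]]
    gains = []
    for _ in range(M + 1):
        informed = sum(PRIOR[i] * best_reward(beliefs[i], R) for i in range(2))
        gains.append(informed - prior_reward)
        beliefs = [step(b, P) for b in beliefs]
    return gains

gains = advantage_per_transition(M)
for m, g in enumerate(gains):
    print(m, round(g, 2))
print("value if one period is needed to react:", round(sum(gains[1:]), 2))
```

Under these assumptions the summed advantage from transition 1 onward comes out at about 172, close to the 171.98 quoted above (the small gap is presumably rounding in the tabulated values), and the per-transition advantage vanishes from the ninth transition on.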

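The rate of decline of this relative advantage is also of interest.
We define

$$p(n) = \begin{cases} \dfrac{\Delta(n-1)}{\Delta(n)} & \text{for } \Delta(n) \neq 0 \\[2ex] 0 & \text{for } \Delta(n) = 0 \end{cases} \tag{2.19}$$

This quantity is plotted in Fig. 2-5.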
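A rough numerical illustration of this rate follows. It rests on two assumptions that should be read as such: that the rate in (2.19) is the ratio of the advantage with n-1 periods to go to the advantage with n periods to go, and that the transition probabilities of the example are 0.1, 0.9, 0.8, and 0.2, which gives a transient eigenvalue of -0.7. The closed-form expressions in `gain` are derived for this two-state example only, and the names are illustrative.

```python
LAMBDA = -0.7   # transient eigenvalue of the assumed chain (0.1 + 0.2 - 1)
M = 10          # horizon of the plotted example
TOL = 1e-9      # advantages below this threshold are treated as zero

def gain(m):
    """Per-transition advantage of perfect information acquired at m = 0."""
    # Best expected reward when the revealed starting state was 1 or 2,
    # weighted by the prior probabilities 8/17 and 9/17, minus the prior reward.
    from_state_1 = 100 * abs(-1 + 18 * LAMBDA ** m) / 17
    from_state_2 = 100 * abs(1 + 16 * LAMBDA ** m) / 17
    return (8 * from_state_1 + 9 * from_state_2) / 17 - 100 / 17

def delta(n):
    """Cumulative advantage with n periods to go (transitions M-n through M)."""
    return sum(gain(m) for m in range(M - n, M + 1))

def rate(n):
    """Assumed rate of perishing: delta(n-1) / delta(n), or 0 once perished."""
    d = delta(n)
    return delta(n - 1) / d if d > TOL else 0.0

rates = {n: rate(n) for n in range(1, M + 1)}
for n in sorted(rates, reverse=True):
    print(n, round(rates[n], 3))
```

With these assumptions every computed rate stays below 0.7, and the rate falls as the number of periods to go shrinks, that is, as time passes.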

2.6  Basic  Definitions 

The phenomenon of the degradation of the value of information over
time, while apparently a characteristic of many real-life decision
problems, is not extensively treated in the literature. North [18],
Smallwood [22], and Howard [13] discuss aspects of inference in a dynamic
situation while Robinson [20] reports on the practical difficulties of
estimating time varying probabilities. However, these articles are
limited to problems of inference without consideration of the value of
the information. The concept of information "perishing" appears more
general and powerful than implied by this literature.

Figure 2-4 V*(n), v*(n), and Δ(n) as a function of n

We as a first step must agree on a definition of information
"perishing." Information may, of course, evolve over time without
affecting the choice of decisions. As an example, suppose there exists
some vector valued state of information $s$ which is a function of time.
Let $s(t)$ represent this functional dependence and assume that
$s(t) \in S$, a set of possible states of information. Then if $\delta^* = \delta^{(k)}$
for all $s(t) \in S$, that is, the optimal decision is the same for all
states of information, would one characterize information acquired at
$m > 0$ as perishing? Or is this instant perishability?

As a second example we consider the case where the decision maker
receives clairvoyance at $m = 0$ and also at $m = m_1$, $m_1 > 0$. Although
we shall analyze this situation in some detail in Chapter 4 it is perhaps
intuitively obvious that the second acquisition of clairvoyance "wipes
out" the value of the first disclosure of perfect information. Is this
information perishing?

We precisely define information perishing. Let $V^*(n)$ be the
expected future rewards with $n$ periods to go conditioned on acquisition
of information (perfect or imperfect) at $m = 0$. Let $v^*(n)$ represent
the expected future reward based solely on prior knowledge at $m = 0$.
Let $\Delta(n) = V^*(n) - v^*(n)$. If $\Delta(n)$ is a non-increasing function of
$n$ without benefit of test, observation, experiment, or other information
acquisition, then the information acquired at $m = 0$ is perishing.
If $\Delta(n) = 0$, the information has perished. The rate of perishing,
$p(n)$, is defined by (2.19), i.e.,

$$p(n) = \frac{\Delta(n-1)}{\Delta(n)}$$


2.7  Generalizations 

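We now rigorously prove several properties of information perishing
and the rate of perishing.

2.7.1 The Reward Structure

We have previously defined the expected reward at any transition $m$
as

$$\langle v(m) \mid \delta(m)=\delta^{(k)}, \varepsilon \rangle = \sum_i \pi_i(m)\, r_{ik} \tag{2.20}$$

There are two other forms of this expression that will be useful.
The first is

$$\langle v(m) \mid \delta(m)=\delta^{(k)}, \varepsilon \rangle = \pi(0)\, [P]^m\, r^{(k)} \tag{2.21}$$

where $\pi(0)$ is the row vector of initial state probabilities and $r^{(k)}$ is
the column of the reward matrix selected by decision $\delta^{(k)}$.
The second results from the expansion of $[P]^m$ into a series of $N$
differential matrices as

$$[P]^m = [Q_0] + \lambda_1^m [Q_1] + \lambda_2^m [Q_2] + \cdots + \lambda_{N-1}^m [Q_{N-1}] \tag{2.22}$$

Substituting (2.22) into (2.21) yields

$$\langle v(m) \mid \delta(m)=\delta^{(k)}, \varepsilon \rangle = \pi(0)[Q_0]\, r^{(k)} + \lambda_1^m\, \pi(0)[Q_1]\, r^{(k)} + \cdots + \lambda_{N-1}^m\, \pi(0)[Q_{N-1}]\, r^{(k)} \tag{2.23}$$

Figure 2-6a plots the reward structure of the two-state example we have
been considering, while Fig. 2-6b represents a general two-state decision
model. The extension to $N$ states is obvious but not representable.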
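For the two-state example the expansion (2.22) has only two terms, and it can be checked numerically. The following sketch is illustrative and makes its assumptions explicit: the transition probabilities are taken as 0.1, 0.9, 0.8, and 0.2, and the matrices `Q0` and `Q1` are the standard closed forms for a two-state chain (the limiting matrix and its complement), not values quoted from the thesis.

```python
P = [[0.1, 0.9],
     [0.8, 0.2]]            # transition probabilities assumed from Fig. 2-1

def matmul(A, B):
    """Product of two 2x2 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def power_direct(P, m):
    """[P]^m computed by repeated multiplication."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    for _ in range(m):
        result = matmul(result, P)
    return result

a, b = P[0][1], P[1][0]     # probabilities of moving 1 -> 2 and 2 -> 1
lam = 1.0 - a - b           # transient eigenvalue, about -0.7 here

# Q0 holds the limiting probabilities in every row; Q1 is the identity minus Q0.
Q0 = [[b / (a + b), a / (a + b)],
      [b / (a + b), a / (a + b)]]
Q1 = [[a / (a + b), -a / (a + b)],
      [-b / (a + b), b / (a + b)]]

def power_expansion(m):
    """[P]^m computed as Q0 + lam**m * Q1."""
    return [[Q0[i][j] + lam ** m * Q1[i][j] for j in range(2)] for i in range(2)]

max_error = max(abs(power_direct(P, m)[i][j] - power_expansion(m)[i][j])
                for m in range(9) for i in range(2) for j in range(2))
print("transient eigenvalue:", round(lam, 4))
print("largest discrepancy over m = 0..8:", max_error)
```

The second term shrinks geometrically with the magnitude of the transient eigenvalue, which is why that eigenvalue governs how quickly the state probabilities, and hence the rewards, return to their limiting values.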

2.7.2  The  Inevitability  of  Information  Perishing 
We have seen in the single example that information perishes. However,
we can establish this result for a far more general case.

Theorem 2.1

For any N-state Markov decision process where the decision maker
may choose both the transition matrix $[P^{(l)}]$ and
the reward from some constant reward matrix $[R] = \{r_{ik}\}$ ($k$ and
$l$ contained in index sets, $K$ and $L$ of decisions and transition
matrices, respectively), $\Delta(n)$ is a non-increasing function of $n$.

Proof

Let $\delta^{(1,1)}$, as an example, represent the decision to choose the
first transition matrix and the first column of the reward matrix.
There are three cases to be considered:

Case 1. We may consider first the trivial case of some decision,
say $\delta^{(1,1)}$, being completely dominant, i.e., both $V^*(m)$ and
$v^*(m)$ imply $\delta^* = \delta^{(1,1)}$ for all $m$. Therefore,
$\Delta(n) = V^*(n) - v^*(n) = 0$, and $\Delta(n)$ is obviously non-increasing.

Case 2. Partial dominance may exist in the sense that $V^*(m)$ and
$v^*(m)$ both imply $\delta^* = \delta^{(1,1)}$ for some $m \ge m_{cr}$. If this be
true, and if $n = m_{cr}$, then $\Delta(n) = 0$ again, and the theorem
is true.

Case  3.  The  Interesting  ease  Is  the  case  of  no  dominance.  Ms  pro- 
ceed by  Induction.  With  one  tins  period  remaining,  to  show  that 


18 


L(,n)  is  s non- Increasing  function  of  n is  equivalent  to  showing 


A(l)  - A(0)  a 0 

However, 

a(l)  - UO)  - [v*(l)  - v*(l)]  - [v*(0)  - v*(0)] 
- [v*(l)  - v*(0)]  - [v*(l)  - v*(0)] 

Counting  forward  we  may  write  (see  Pig.  2-2) 
v*(l)  - v*(M-l)  + v*(M) 

and 

v*(0)  - v*(H) 

Similarly  we  may  express  the  other  two  terms  as 
v*(l)  - v*(M-l)  + v*(M) 

and 

v^CO)  ■ v*(M) 


(2.24) 


(2.25) 

(2.26) 

(2.27) 

(2.28) 


Performing  the  obvious  subtraction  we  can  express  (2.24)  as 

4(1)  - 4(0)  - v*(H-l)  - v*(M-l)  (2.29) 

which  is  obviotisly  greater  or  equal  to  sero,  Assune  the  induction 
hypothesis  holds  for  h-l  time  periods.  It  remains  to  show  that 
the  theorem  holds  for  n time  periods  to  go,  or  that 
4(n)  - 4(n^l)  h 0 . 

4(n)  - (>(n-l)  + v*(Tj]  - [v4(n-l)  + v*(Tt))  (2.30) 

where  "rf',  countlne  forward,  is  the  transition  at  which  there  are 
n transitions  to  so.  Therefore, 

4(n)  ■ 4(n-l)  + v*(Ti)  - v*(Tt) 
or 

4(n)  . 4(i»-l)  • v*(r])  • v*(t^  k 0 
which completes the proof.

We may extend these results by consideration of a continuous time
process. We shall use "t" to indicate time starting from time "zero"
and "$\tau$" to indicate time to go. We shall assume a constant but
completely general generation of rewards as shown in Fig. 2-7.

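To parallel (2.18) we define

$$\Delta(t) = \int_t^T v_I^*(\tau)\,d\tau - \int_t^T v^*(\tau)\,d\tau
  = \int_t^T [v_I^*(\tau) - v^*(\tau)]\,d\tau \qquad (2.31)$$

where $v_I^*$ is the value with information and $v^*$ the value on
prior information alone. To show that the information is perishing we
show $d\Delta(t)/dt \le 0$, or

$$\frac{d}{dt}\int_t^T [v_I^*(\tau) - v^*(\tau)]\,d\tau
  = -[v_I^*(t) - v^*(t)] \le 0 \quad \text{for all } t \le T \qquad (2.32)$$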


Thus, we see that in a decision process that continues over some
period of time, any information is perishing. We emphasize this
result by stating Theorem 2.2.

Theorem 2.2

All information is perishing (assuming the reward structure is
constant over time).

2.7.3 The Rate of Information Perishing

a. Introduction. We have noted in Fig. 2-5 that the rate of
information perishing as defined by (2.19) was always less than 0.7, the
absolute value of the transient* eigenvalue. Is this result always
true?

*Transient connotes eigenvalues not equal to one.

b. Initial Result. We may show that this result holds not only
in the example but in a far more general case.

The case we shall consider is this:

(1) N-state process

(2) k decisions possible with reward matrix

$$[R] = \begin{bmatrix}
r_1^{(0)} & r_1^{(1)} & \cdots & r_1^{(k-1)} \\
\vdots    &           &        & \vdots      \\
r_N^{(0)} & r_N^{(1)} & \cdots & r_N^{(k-1)}
\end{bmatrix} \qquad (2.33)$$

Let $r_i^{(j)}$ be a general element of $[R]$.

(3)  The  transition  matrix  [P]  is  not  part  of  the  decision.  In 
other  words,  [P]  is  invariant. 

(4) The decision maker receives perfect information at transition
zero, but there is no observation or information after this.

(5)  The  decision  maker  has  no  risk  aversion. 

(6)  A decision  is  possible  at  each  transition. 

We shall prove that $\rho(n) \le |\lambda_t|$, where $\lambda_t$ is the
maximum in absolute value of the transient eigenvalues associated with
the transition matrix $[P]$.


Theorem  2.3 

For the N-state Markov process with k reward decisions and an
invariant transition matrix, $\rho(n) \le |\lambda_t|$, the absolute
value of the largest transient eigenvalue.


The  proof  of  this  theorem  is  of  such  length  that  it  is  reserved  to 
Appendix  A. 
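
The bound of Theorem 2.3 can be checked numerically on a small chain. The
sketch below is illustrative only. The two-state transition matrix `P` and
the symmetric reward matrix `R` are assumptions, chosen so that the
transient eigenvalue is -0.7; they are not quoted from this section. The
prior is taken to be the stationary distribution, Δ(n) is computed as the
sum of the per-transition gains from perfect information received at
transition zero, and ρ(n) is taken as Δ(n-1)/Δ(n), following the form of
(2.44).

```python
import numpy as np

# Assumed example data (hypothetical, not quoted from the text).
P = np.array([[0.2, 0.8],
              [0.9, 0.1]])            # invariant transition matrix
R = np.array([[100.0, -100.0],
              [-100.0, 100.0]])       # R[state, decision]
M = 10                                # horizon in transitions


def stationary(P):
    """Stationary distribution: left eigenvector of P for eigenvalue 1."""
    w, v = np.linalg.eig(P.T)
    pi = np.real(v[:, np.argmin(np.abs(w - 1.0))])
    return pi / pi.sum()


def gain(prior, m):
    """Expected gain at transition m from perfect information at transition 0."""
    Pm = np.linalg.matrix_power(P, m)
    with_info = sum(prior[s] * np.max(Pm[s] @ R) for s in range(len(prior)))
    prior_only = np.max(prior @ Pm @ R)
    return with_info - prior_only


prior = stationary(P)
gains = [gain(prior, m) for m in range(M + 1)]
# Delta(n): gains summed over the last n + 1 transitions, M - n through M.
delta = [sum(gains[M - n:]) for n in range(M + 1)]
# rho(n) = Delta(n - 1) / Delta(n), defined only where Delta(n) is nonzero.
rho = {n: delta[n - 1] / delta[n] for n in range(1, M + 1) if delta[n] > 1e-9}

eig = np.linalg.eigvals(P)
lam_t = max(abs(x) for x in eig if not np.isclose(x, 1.0))  # transient eigenvalue

for n in sorted(rho):
    print(f"n={n:2d}  Delta={delta[n]:7.2f}  rho={rho[n]:.3f}  bound={lam_t:.3f}")
```

Under these assumptions every printed ρ(n) stays at or below the bound.
A different chain or reward matrix can be substituted by editing `P` and
`R`.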

c. An Extension. We had limited the previous proof to decision
situations where the decision was limited to a choice of state, and the
transition matrix was invariant. However, we may also extend the result
to the situation where the decision maker may elect not only the reward
structure but also the transition matrix.

Theorem 2.4

For an N-state Markov process let $k \in K$ represent an index set of
reward decisions and $l \in L$ represent an index set of transition
matrix decisions. Then

$$\rho(n) \le |\lambda_t|$$

where $|\lambda_t| = \max_l |\lambda^{(l)}|$, $|\lambda^{(l)}|$ being the
greatest absolute value of the transient eigenvalues of $[P^{(l)}]$.

The proof of this theorem will also be found in Appendix A.

d. The Acceleration of Information Perishing. Figure 2-5 shows
that $\rho(n)$ is a decreasing function of $n$ or that information
perishes more rapidly with the passage of time. We may show that this is
a general result for those decision situations where the transition
matrix is invariant. We first need to prove a lemma concerning the
reward structure.

Lemma 2.1

For an N-state Markov process where the decision maker's alternatives
are limited to choice of columns from the reward matrix there exist
for some starting state, say $s(0) = 1$, at most three optimal
policies. Further, if all the eigenvalues are positive, there exist
at most two optimal policies.

Proof 

We  use  (2.22)  to  write 


N-1  N 

<v(*0l*(0)-i>  - max  ^ ^ (2.34) 

J-0  >1 

where  X^  ■ 1 . Let  M -»  • so  that  ® » J ^ 0 . Obviously, 

H 

<V(M)  ls(0)-l>  - max  ^ 

^ 1-1 


the "stationary" policy noted in Howard [2]. This is the first of
the two or three policies. Now assume $\lambda_j > 0$, all $j$. We
represent the scalar product of (2.35) by ${}_jC^{(k)}$ so that

M-1 

<v00  |e(0)-l>  - max  ^ x”  (2.36) 

J-0 

Assume that $k = 1$ for the $M_a$-th transition and that for the
$M_b$-th transition, $M_b > M_a$, $k = 2$. The ${}_jC^{(k)}$ for the two
decisions must differ in at least one term. We will let $j = 1$ be that
term. By the assumed optimality

J-O  J"0 

JI*! 

N-1 

^ ,Mfo  ^(2)  ^ V .M  -(1)  ,, 


^r“  l"^'^  I 


.>tfa  -(1)  . .Mfa  A2y 

\ iC  i jC 


C<1>  < c<2) 

iC  « 


(2.39) 


(2.«)) 


However,  if  a , then  decision  1 would  be  improved  by 

switching  to  decision  2 as  all  the  other  C's  are  the  same,  and 
all  the  eigenvalues  are  positive.  Therefore,  there  cannot  be  two 
decisions  that  are  optimal  for  the  different  transitions.  Similar 
reasoning prevails if $\lambda_j \le 0$ for some $j$ except now the optimal
decisions may switch from odd to even transitions.

This  completes  the  proof  of  the  lemma  and  allows  us  to  state  the 
following  corollary. 

Corollary 2.1

For the N-state Markov process with an invariant transition matrix

$$\rho(n) \ge \rho(n-1) \qquad (2.41)$$

Proof

The corollary requires that

$$\Delta^2(n-1) \ge \Delta(n)\,\Delta(n-2) \qquad (2.42)$$

The proof follows by induction on $n$ using the expressions for
$\Delta(n)$, $\Delta(n-1)$, and $\Delta(n-2)$ developed in (A.6), (A.7),
and (A.26). Then Lemma 2.1 makes possible term by term comparisons. The
details are an exercise in tedium rather than enlightenment and are
omitted.

2.7.5  The  Effect  of  Risk  Aversion 

a.  Definitions.  To  this  point  we  have  tacitly  assumed  that  the 
decision  maker  based  his  decision  on  expected  values.  A logical  next 
step  is  consideration  of  the  effects  of  risk  aversion.  We  shall  limit 
the  discussion  to  exponential  utility  functions. 

A natural extension of the defining equation for $\Delta(n)$ [(2.18)] is

$$\tilde\Delta(n) = \tilde v_I^*(n) - \tilde v^*(n) \qquad (2.43)$$

where $\tilde v_I^*(n)$ represents the certain equivalent with $n$
periods to go conditioned on receipt of perfect information at $m = 0$,
and $\tilde v^*(n)$ represents the certain equivalent with $n$ periods to
go based solely on prior information. Howard and Matheson [16] have shown
that the "delta property" of the exponential utility function allows
summation of the certain equivalents.

Analogous to Eq. (2.19) is

$$\tilde\rho(n) = \frac{\tilde\Delta(n-1)}{\tilde\Delta(n)}, \quad
  \tilde\Delta(n) \ne 0 \qquad (2.44)$$


Assume that the decision maker in the basic example has a risk
aversion coefficient $\gamma = 0.001$. We may calculate
$\tilde\Delta(n)$ and $\tilde\rho(n)$, which are plotted in Figs. 2-8 and
2-9 (along with the comparable values of $\Delta(n)$ and $\rho(n)$).

c. Generalizations. Comparisons of $\tilde\Delta(n)$ and $\Delta(n)$,
and of $\tilde\rho(n)$ and $\rho(n)$, for the general Markov case are made
possible by use of the approximation

$$\tilde v(m) \approx \bar v(m) - \tfrac{1}{2}\gamma\,\check v(m) \qquad (2.45)$$

where $\tilde v(m)$, $\bar v(m)$, and $\check v(m)$ represent the certain
equivalent, mean, and variance, respectively, of the profit lottery on the
$m$-th transition. (The approximation results in an error of less than
0.2% in the example.)
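
As a rough check on approximation (2.45), the sketch below compares the
exact certain equivalent under an exponential utility function with the
mean-minus-half-gamma-variance form. The lottery used here (+100 with
probability 0.8, -100 with probability 0.2) is a hypothetical stand-in, not
the lottery of the example, so the size of the error it reports need not
match the figure quoted above.

```python
import math


def certain_equivalent_exact(outcomes, probs, gamma):
    """Exact certain equivalent for the utility u(x) = -exp(-gamma * x)."""
    expected = sum(p * math.exp(-gamma * x) for x, p in zip(outcomes, probs))
    return -math.log(expected) / gamma


def certain_equivalent_approx(outcomes, probs, gamma):
    """Approximation in the form of (2.45): mean minus (gamma / 2) * variance."""
    mean = sum(p * x for x, p in zip(outcomes, probs))
    variance = sum(p * (x - mean) ** 2 for x, p in zip(outcomes, probs))
    return mean - 0.5 * gamma * variance


# Hypothetical lottery and the risk aversion coefficient used in the text.
outcomes, probs, gamma = [100.0, -100.0], [0.8, 0.2], 0.001

exact = certain_equivalent_exact(outcomes, probs, gamma)
approx = certain_equivalent_approx(outcomes, probs, gamma)
relative_error = abs(approx - exact) / abs(exact)
print(f"exact={exact:.3f}  approx={approx:.3f}  relative error={relative_error:.4%}")
```

The approximation is a second-order expansion in γ, so it degrades as γ
times the spread of the lottery grows.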


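Approximation (2.45) allows (2.43) to be rewritten as

$$\tilde\Delta(n) = \sum_{m=M-n}^{M} \left\{ \bar v_I^*(m)
  - \tfrac{1}{2}\gamma\,\check v_I^*(m) - \bar v^*(m)
  + \tfrac{1}{2}\gamma\,\check v^*(m) \right\} \qquad (2.46)$$

$$= \sum_{m=M-n}^{M} \left\{ \bar v_I^*(m) - \bar v^*(m)
  + \tfrac{1}{2}\gamma\,[\check v^*(m) - \check v_I^*(m)] \right\} \qquad (2.47)$$

$$= \Delta(n) + \tfrac{1}{2}\gamma \sum_{m=M-n}^{M}
  [\check v^*(m) - \check v_I^*(m)] \qquad (2.48)$$

where the subscript $I$ marks quantities computed with information.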

For a "symmetric" reward matrix of the form

$$[R] = \begin{bmatrix} +r & -r \\ -r & +r \end{bmatrix} \qquad (2.49)$$

the variance with information is less than the variance without
information and we conclude the following theorem.

Theorem 2.5

For a symmetric reward matrix $\tilde\Delta(n) \ge \Delta(n)$. (For a
general reward matrix one may construct counter-examples to Theorem 2.5.)

We may also show

Theorem 2.6

For the symmetric reward matrix $\tilde\rho(n) \le \rho(n)$. By the use of
(2.48) we may write


Un-1)  ■¥  ^ ^YpJ^CiO  - XiO] 

MKa-l) 

M 

I 2 J 


(2.50) 


We may simplify this expression considerably by letting

$\check v^*(m) = S$ (as the variance without information is a constant),

$\check v_I^*(m) = S(m)$, the variance with information,

$L$ = transition with $n$ to go, and

$$V(L) = v_I^*(L) - v^*(L) \qquad (2.51)$$


Substituting these into (2.50), cross-multiplying, transposing and
simplifying yields

M M 

Mn)  ^ fS-S(jO]  s A(n-l)  ^ [S-S(0]  (2.52) 

jH*-(n-l)  jHl-n 

M 

[V(L)  + Mn-1)]  ^ [S-S(i)] 

jW*-(n-l) 

M 

i A(n-l){s-S(L)  + ^ [S-S(j0]|  (2.53) 

jHI-  (n- 1) 


V(L)(n-l)S  + A(n-1)S(L)  S A(n-l)S  + V(L)  ^ S(i)  (2.54) 
Dividing  by  V(L)  S(L)  results  in 


I S(iO 

(n-l)S  , A(n-l)  . Mn-DS  . lF«-(n-l) 

S(L)  V(L)  V(L)S(L)  S(L) 


(2.55) 


V(L)  L S(L)J  ‘ 


^ 8(j0  - (n-l)S 


.jWt-(n-l) 


(2.56) 


Ato-Jl  c CPri)  yOi)  , n.i 

V(L)  V(L)  " ^ 


(2.57) 
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ve  may  prove  the  inequality  by  proving 


M 

I S(<) 

.jWKn-1) 

- (n-l)S 

S(L) 


(2.58) 


or 


1 • ■■  a 
^ S(L)  ^ 


r M 1 

Y liA 

L n-l 

■B-M-(n-l) 

- S 

S(L) 


(2.59) 


and 


M 


iWl-(n-l) 


M 


I ^ 

je-M-(n-l) 


(2.60) 


(2.61) 


But  S(L)  is  the  minimum  variance  so  that 


M 

S(L)  S(L)  s ^ ^ (2.62) 

i^(n-l) 

This completes the proof.

2.7.6  Transient  Processes 

The previous examples involved only Markov chains with recurrent
states. We briefly digress to consider the transient chain shown in
Fig. 2-10. We shall assume a reward matrix

$$[R] = \begin{bmatrix}
+100 & -100 & -100 \\
-100 & +100 & -100 \\
-100 & -100 & +100
\end{bmatrix}$$

If the process had run for some length of time

In this case $\delta^*(m) = \delta^{(3)}$, and
$v^*(m) = 0(-100) + 0(-100) + 1(100) = 100$. $\Delta(n)$ and $\rho(n)$
are both trivially equal to zero.

We may create a more interesting example by assuming that the
process has just begun and that some outside probability mechanism such
as the flip of a fair coin determines if state 1 or state 2 is the
initial state. In other words, as shown in Fig. 2-10,

$$P\{s(0)=1 \mid \varepsilon\} = P\{s(0)=2 \mid \varepsilon\} = 0.5$$

$\Delta(n)$ and $\rho(n)$ are plotted in Fig. 2-11. The figure confirms
that $\Delta(n)$ is a decreasing function of $n$ and
$\rho(n) < |\lambda_t| = 0.7$.

2.8 Summary

This chapter has developed the fundamental concepts and results
necessary for an understanding of the dynamics of the value of
information. The most important result was the inevitability of
information perishing. Equally significant is the result that the value of
information for a Markov process perishes at a rate that exceeds the
shrinkage of the underlying process. The chapter also considered the
effect of risk aversion where the utility function can be modeled by an
exponential expression. The following chapter extends these results by a
slight alteration of the basic decision model.

CHAPTER 3
THE DECISION MODEL


3.1 Purpose

This chapter describes the decision model that will be used throughout
the remainder of the thesis.

3.2 An Historical Example

During the 1960s the United States, as a portion of its NATO strategy,
pre-stocked the equipment for several U.S. Army divisions in Western
Europe. This equipment was matched to designated units based within the
United States. The anticipated mode of employment was an airlift of
personnel to Western Europe, "marrying up" with the equipment, and
subsequent deployment in defense of NATO allies. The motivation for this
plan was to cut the reaction time in countering any Russian aggression.
The concept was tested during the 1960s in a series of exercises dubbed
"Reforger."
3.3  Comparison  with  the  Extant  Decision  Model 

A comparison of this strategy with the "usual" decision model reveals
some subtle differences.

The existing model [15,19], depicted in Fig. 3-1, implicitly recognizes
a random event, "A decision is needed." The entire analysis and
interest then follows this random event. There is no subsequent
uncertainty concerning the occurrence of the decision.

In the cited historical example there is some probability that the
Russians will never attack Western Europe and that a decision, in the
sense of tactical deployments, will never be made. The U.S. strategy in
Europe is assuredly a complex set of supporting decisions. However, the
essence of the approach is shown in Fig. 3-2. The significant difference
is the recognition of uncertainty in the occurrence of the ultimate
decision (the method of defending Europe).

This leads to these metaphors:

1. Reaction decision making. The decision maker sets his decision
vector after the need for a decision is recognized as a
certainty or near-certainty.

2. Contingency decision making. The decision maker partially
or completely sets his decision vector before the need for a
decision is certain.

Figure 3-1 A decision-making model (Decision needed? / Decisions / Outcomes)

Figures 3-3a and 3-3b illustrate the two concepts and suggest a
fundamental hypothesis: The set of alternatives available to the
contingency decision maker is at least as great if not greater than the
set available to the reaction decision maker.

3.4  The  Contingency  Decision  Model 

Figure 3-2 does not completely tell the story of the European pre-stock
strategy. As we noted in describing the example the United States
periodically tested the plan, incurring some costs. In addition, the
type and amount of pre-stocked supplies might vary depending on the U.S.
state of information, and the final decision is obviously a function of
this initial decision. These nuances are depicted in Fig. 3-4.

We will find it helpful in our subsequent analysis to characterize
the event "Russian Attack" as a binary "outcome switch." In the "on"
position the decision maker completes his decision, if necessary, and
receives the reward from his lottery. In the "off" position the decision
maker does not receive the outcome of his lottery but recycles to
reconsider his pre-set decision.

The setting of the outcome switch may be affected by:

1. Competitive or Gaming Factors.

Example: The deployment of U.S. troops is contingent on the
exact timing of the Russian attack.

2. Environmental Factors.

Example: The decision maker will buy a new car when his present
one requires a new motor.

3. Factors within the Control of the Decision Maker.

Example: The decision maker will buy a new car in 1976.

4. A Combination of Previous Factors.

Example: The decision maker will buy a new car in 1976 unless
his present one requires a new motor prior to that
date.

There are also situations where the decision may be repetitive, and
the decision maker would recycle to reconsider his decision after
receiving the results of an outcome lottery.

Figure 3-3 Two decision-making models

Figure 3-5 reflects these concepts and is the decision model that
will be used throughout the remainder of this study.

3.5  Other  Examples 

We have concentrated generally on one decision, the U.S. pre-stock
of military equipment in Europe. However, other examples of contingency
decision making abound. These would include:

1. Military and industrial "intelligence" collection decisions.
2. Put and call market operations.
3. Many R&D decisions.

While any broad generalization is dangerous, a common thread is a desire
to cut reaction time when a decision is needed. As a consequence, some
initial preparation is accomplished before the final decision is taken
as a certainty.

3.6 Summary

This chapter has described the decision model that will be used for
subsequent analysis.

Figure 3-5 Contingency decision-making model

CHAPTER 4

THE CONTINGENCY DECISION MODEL AND INFORMATION DYNAMICS

4.1 Purpose

This chapter extends the results of Chapter 2 by a fuller examination
of the contingency decision model.

4.2 Introduction

The basic example in Chapter 2 served the purpose of a suitable
framework for the development of the basic concepts of information
dynamics. However, the scenario--a series of M a priori decisions with no
opportunity to benefit from the information gained from the intervening
outcomes--may be considered a somewhat forced and contrived example of
contingency decision making. A second example, more natural than the
first, serves to amplify the previous discussions.

4.3 Example Two

The decision maker is the owner of a model "A" automobile. His
mechanic has recently told him that there is a 0.5 probability the car
will fail two years hence and have to be replaced. However, if the
car does not fail in the second year, then there is a 0.5 probability
that it will fail in the fifth year. Peculiar to the model "A" is the
fact that if it does not fail by the fifth year it will last forever.
Peculiar to the decision maker is his unconcern for events beyond the
tenth year.

The decision maker is an advocate of the model "A". Recently he has
received the disquieting news that the company has in the past few years
developed a pattern of production runs that results in one year's car
being a "peach," but in many cases the following year's model is a
"lemon." The company engineers feel the Markov model of Fig. 4-1 captures
this pattern.

The decision maker also has the option of buying a model "B", a more
reliable but more expensive car. After some consideration he has
developed a reward structure as shown in Table 4-1.

P{Lemon at year M | Lemon at year M-1, ε} = 0.1
P{Peach at year M | Lemon at year M-1, ε} = 0.9
P{Peach at year M | Peach at year M-1, ε} = 0.?
P{Lemon at year M | Peach at year M-1, ε} = 0.6

TABLE 4-1

Reward Structure, Example 2

                      Buy Model "A"    Buy Model "B"
Model "A" a peach         +100             -100
Model "A" a lemon         -100             +100

What now is the value of information concerning the quality of this
year's production of Model "A's"? We can recalculate $v_I^*(n)$ (with
information) and $v^*(n)$ (prior information only) keeping in mind that

P{outcome switch "on" at m=2 | ε} = 0.5

P{outcome switch "on" at m=5 | switch "off" at m=2, ε} = 0.5

P{outcome switch "on" at m=5 | ε} = (0.5)(0.5) = 0.25

The values with and without information, together with $\Delta(n)$ and
$\rho(n)$, are plotted in Fig. 4-2.

Figure 3-4 suggests an extension to this example. Assume some cost,
say 5 units, is incurred each transition if there is no receipt of the
profit lottery. The cost, for instance, might be maintenance and
operation of the Model "A". The resulting values, $\Delta(n)$, and
$\rho(n)$ are plotted in Fig. 4-3. Comparison of the last two figures
shows that the alteration of the example changes the values with and
without information but not $\Delta(n)$ and $\rho(n)$. We also note in
both figures that $\Delta(n)$ is non-increasing over the horizon of
interest, but $\rho(n)$ behaves inconsistently with our previous results.
What changes are needed to rationalize this behavior?

4.4  The  "Outcome  Switch" 

We first must formalize the probabilistic nature of the occurrence
of the decision.

Let

$\Gamma(m) = Y$ be the event that the "outcome switch" is "on" at
transition $m$, i.e., the decision maker receives the reward
from his profit lottery at transition $m$, and

$\Gamma(m) = N$ be the complementary event that the "outcome switch" is
"off" at transition $m$, i.e., the decision maker does
not receive his reward.

Let

$$g(m) = P\{\Gamma(m) = Y \mid \varepsilon\} \qquad (4.1)$$

$$1 - g(m) = P\{\Gamma(m) = N \mid \varepsilon\} \qquad (4.2)$$
We have discussed in Chapter 3 (see pages 40-41) that the occurrence
of the decision may be highly dependent on any number of previous events.
Designate such events as E(1), E(2), E(3), ..., E(m-2), E(m-1), E(m).
Then

$$g(m) = P\{\Gamma(m) = Y \mid \varepsilon\}
  = P\{E(m) = Y \mid E(m-1), \ldots, E(1), \varepsilon\}\,
    P\{E(m-1) \mid E(m-2), \ldots, E(1), \varepsilon\}\,
    P\{E(m-2) \mid E(m-3), \ldots, E(1), \varepsilon\} \cdots
    P\{E(1) \mid \varepsilon\} \qquad (4.3)$$

We shall assume that the marginal probability, $g(m)$, is always
available, either by direct assessment, modelling, computation or by some
combination of these techniques.
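
One way to obtain the marginal probabilities by computation is sketched
below for a decision that can occur at most once, as in Example 2. The
function name and the `hazard` dictionary are hypothetical conveniences:
`hazard[m]` is taken to be the probability that the switch is "on" at
transition m given that it was "off" at every earlier transition, and the
chain rule of (4.3) reduces to a running product.

```python
def marginal_on_probabilities(hazard, horizon):
    """Return g(m) for m = 0..horizon from conditional "on" probabilities.

    hazard[m] = P{switch on at m | switch off at all earlier transitions};
    transitions missing from the dictionary have conditional probability 0.
    """
    g = {}
    still_off = 1.0                      # probability the switch has not yet been on
    for m in range(horizon + 1):
        h = hazard.get(m, 0.0)
        g[m] = still_off * h             # chain rule, as in (4.3)
        still_off *= 1.0 - h
    return g


# Example 2: 0.5 at m = 2, and 0.5 at m = 5 given no occurrence at m = 2.
g = marginal_on_probabilities({2: 0.5, 5: 0.5}, horizon=10)
print({m: p for m, p in g.items() if p > 0})
```

The sketch does not cover repetitive decisions, where the switch may be
"on" at more than one transition.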

We can associate a random variable, $X(m)$, with the process such
that

$$X(m) = 1 \text{ if } \Gamma(m) = Y, \qquad
  X(m) = 0 \text{ if } \Gamma(m) = N \qquad (4.4)$$

At any transition $m$ the probability mass function for $X(m)$ is
described by Fig. 4-4. In particular, for the example we have just
described, the mass functions for $X(m)$ are shown in Fig. 4-5.

We discern that receipt or non-receipt of the profit lottery in no
way affects a priori cerebral consideration of the optimal strategy.
Therefore, we redefine (2.5) as

$${}_c v^{(k)}(m) = \langle v(m) \mid \delta(m) = \delta^{(k)},
  \Gamma(m) = Y, \varepsilon \rangle \qquad (4.5)$$

where the subscript "c" emphasizes the contingency decision making. Then
the obvious relationships exist

$${}_c v^{(k)}(m) = v^{(k)}(m)\, g(m) \qquad (4.6)$$

$${}_c v^*(m) = v^*(m)\, g(m) \qquad (4.7)$$

$${}_c v_I^*(m) = v_I^*(m)\, g(m) \qquad (4.8)$$

where $v_I^*$ is the value with information.

4.5 Comparative Results from Contingency Decision Making

4.5.1 The Effect on $\Delta(n)$

The last results of Section 4.4 lead immediately to

Theorem 4.1

$${}_c\Delta(n) \le \Delta(n)$$

Proof

$$\Delta(n) = \sum_{m=M-n}^{M} [v_I^*(m) - v^*(m)] \qquad (4.9)$$

and

$${}_c\Delta(n) = \sum_{m=M-n}^{M} [v_I^*(m)\, g(m) - v^*(m)\, g(m)] \qquad (4.10)$$

Thus,

$${}_c\Delta(n) = \sum_{m=M-n}^{M} g(m)\,[v_I^*(m) - v^*(m)] \le \Delta(n)$$
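
The inequality of Theorem 4.1 follows because each g(m) lies between 0 and
1 and each per-transition gain is non-negative. A minimal numerical
illustration is given below. The list `gains` holds hypothetical
per-transition values of information chosen only for illustration; `g` uses
the occurrence probabilities of Example 2 over an assumed horizon of M = 5.

```python
def delta(gains, n, g=None):
    """Sum of gains over transitions M - n through M, optionally weighted by g(m)."""
    M = len(gains) - 1
    return sum((1.0 if g is None else g[m]) * gains[m]
               for m in range(M - n, M + 1))


gains = [90.0, 60.0, 40.0, 25.0, 15.0, 10.0]   # hypothetical gains, m = 0..5
g = [0.0, 0.0, 0.5, 0.0, 0.0, 0.25]            # Example 2 occurrence probabilities

for n in range(len(gains)):
    plain, contingent = delta(gains, n), delta(gains, n, g)
    print(f"n={n}  Delta={plain:6.1f}  contingent Delta={contingent:5.1f}")
```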

4.5.2 The Effect on $\rho(n)$

The effect of contingency decision making on $\rho(n)$ is not as
obvious as the effect on $\Delta(n)$. In the example,

$\rho(n)$ is not less than $|\lambda_t|$,

and

$\rho(n)$ is not greater than $\rho(n-1)$ for all $n$,

both in contradiction to previous results.

In Example 1, $g(m) = 1.0$, $0 \le m \le 10$, and $g(m) = 0$, $m > 10$.
Moreover, for $m \ge 8$, the values with and without information are
equal. In Example 2, $g(m) = 0.5$ for $m = 2$; $g(m) = 0.25$, $m = 5$; and
$g(m) = 0$, otherwise. Obviously $g(m)$ is constant in Example 1 (at least
for the transitions that cause a contribution to $\Delta(n)$). In
Example 2, $g(m)$ is both increasing and decreasing. Consideration of
these results leads to a restatement of Theorem 2.3, as

Theorem 4.2

${}_c\rho(n) \le |\lambda_t|$ if $g(m)$ is non-increasing in $m$.
Proof 


There exist two constants $\alpha$ and $\beta$ ($1 \ge \alpha$,
$\beta \ge 0$, $\alpha > 0$), such that $\alpha \ge g(m)$ for all $m$, and
$\beta \le g(m)$ for all $m$. Then one may show that
$\Delta(n,\alpha) \ge {}_c\Delta(n) \ge \Delta(n,\beta)$. Similar to the
proof of Theorem 2.3 we may show that


and 

<*•“> 

$\Delta(n)$ and $\Delta(n-1)$ are continuous in $g(m)$. Thus, it follows
that $\rho(n)$ is also continuous in $g(m)$. From (4.11) and (4.12) we
conclude that the theorem holds.


The converse of this statement does not hold in general as can be
seen by alteration of Example 1. Suppose $g(0) = 0.5$, $g(1) = 1.0$, and
$g(m) = 0$, $m \ne 0, 1$. Then $\Delta(10) = 106.35$, and
$\Delta(9) = 59.29$, or $\rho(10) = 0.56 \le |0.7|$. This is true in spite
of $g(0) \le g(1) \ge g(2)$.


4.6 Two Peripheral Issues

4.6.1 Intermediary Information Acquisition

Suppose for Example 3 we alter Example 2 as follows: The decision
maker, following any year he purchases an "A" model car, has the
probabilities 0.5 and 0.25 of a required successive purchase in the
succeeding two and five years. To be specific--if the decision maker buys
an "A" model in year 2 then he has a 0.5 chance of requiring a successive
purchase in year 4 and a 0.25 chance in year 7. We also assume he receives
perfect information if he purchases the car. In other words, he knows
at (or immediately after) the time of purchase if the current year's model
is a "peach" or a "lemon." He can, of course, use this data in succeeding
years. What now is the value of information acquired at $m = 0$, and how
do $\Delta(n)$ and $\rho(n)$ change?

Figure 4-6 depicts this behavior. The important consideration is
the fact that receipt of a second input of perfect information destroys
any residual incremental value from the first acquisition. This will be
of key concern in the optimal strategy of information acquisition in the
following chapter.

4.6.2 Discounted Rewards

What is the effect of discounting on $\Delta(n)$ and $\rho(n)$? Assume in
Example 2 that the stream of rewards is discounted by some factor $\beta$.
In other words, the present value of the expected reward at the $m$-th
transition is

$$v_\beta^*(m) = \beta^m v^*(m), \quad \beta \le 1.0 \qquad (4.13)$$

If, in particular, we assume that for Example 2 $\beta = 0.9$, then the
appropriate values for ${}_c\Delta(n)$ and ${}_c\rho(n)$ are plotted in
Figs. 4-7a and 4-7b.

Examination of Fig. 4-7a reveals that $\Delta(n)$ is increasing in
violation of Theorem 2.2. However, this illusory appreciation of
information is a violation, not of the theorem, but of the assumptions of
the theorem. The development in Chapter 2 was premised on a constant
reward structure. Once discounting is introduced there is no longer a
constant reward over time. Therefore, there is no guarantee that the
results of Chapter 2 would remain valid.

4.7 Summary

This chapter has developed the foundations for the contingency decision
model and the effects that contingency decision making has on the
basic results of Chapter 2. The important result that information perishes
within the context of the decision model leads logically to the next
chapter, a consideration of the strategy of information replenishment.

CHAPTER 5

INFORMATION REPLENISHMENT


5.1 Introduction

This chapter, in a sense, begins the obverse of the previous results.
Information perishes over time. What strategy of information
replenishment reverses this perishing?

In the static case the question of information replenishment is
straightforward. If the expected value of information is greater than
the cost of gaining the information, then the prudent strategy is to
acquire the information. However, the issue is far more complex in the
contingency decision model. The decision maker faces this paradox. If
he acquires information at $m = 0$, the information may have perished by
the time the decision actually occurs. Conversely, he may delay his
information gathering and be caught short, having to make his decision on
a less than complete state of information. We may call this process
speculative information acquisition as it has many of the characteristics
that futures speculation has in any commodity market.


5.2  Two  Examples 


5.2.1 A Simple Case: Example 4

We illustrate with a trivial but useful example. Let the state of
information be characterized by the two-state Markov model that we have
previously used (see Fig. 2-1, page 7).

The alternatives and reward matrix are those used in Example 1 (see
pages 5-6). We further assume that the decision occurs with probability
1.0 at $m = 5$ and $m = 8$ and with probability 0 otherwise. Utilizing
the notation of Chapter 4 we have

$$g(5) = g(8) = 1.0, \qquad g(m) = 0.0, \; m \ne 5, 8 \qquad (5.1)$$

The decision maker has a planning horizon of eight transitions.

The decision/information acquisition model is shown in Fig. 5-1.
We should particularly note from the model that information acquired at
the $m$-th transition is not available for a decision at the $m$-th
transition but is available, at the earliest, at the $(m+1)$-th
transition. At this time it has perished for one unit of time.

Figure 5-1 Decision/information acquisition model (Information Acquired? / Decision Occurs?)

To return to the example we may establish a base case that would
consist of an expected reward of +5.88 at $m = 5$ (based on prior
information only) and an expected reward of 33.83 at $m = 8$ (based on
acquisition of perfect information as a result of observing the outcome
of the decision at $m = 5$). (These expected rewards were established
in Chapter 2.)

It is transparently clear that although the combinations and
permutations of schemes of data acquisition are almost unlimited
(assuming the cost of information gathering is constant over time), only
four merit consideration:

$\delta^{(0)}$--No acquisition;

$\delta^{(1)}$--Acquire perfect information at $m = 4$ only;

$\delta^{(2)}$--Acquire perfect information at $m = 7$ only;

$\delta^{(3)}$--Acquire perfect information at $m = 4$ and $m = 7$.

Table 5-1 summarizes the results for each alternative.

TABLE  5-1 

Summary  of  Expected  Rewards 


Figure 5-2 is a plot of Column (5) of Table 5-1.

As is obvious from the figure, $\delta^{(2)}$ is dominated at all costs, C.
However, the choice among the other alternatives is a function of the
cost of the information gathering program. At low costs the extensive
program is preferred, at intermediate costs the restricted program is
dominant, while for costs over 63.53 the best strategy is to forgo
information acquisition.
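The comparison of alternatives by cost can be sketched numerically. The block below is a minimal illustration, not the computation behind Table 5-1: the gross gains over the base case are assumed from the rows of Table 5-2 that coincide with these alternatives (63.53 for acquisition at m = 4, 35.57 at m = 7, 99.10 for both), and the names `ALTERNATIVES`, `net_gain`, and `best_alternative` are hypothetical.

```python
# Gross expected gain over the base case and number of acquisitions for each
# alternative. The gains are assumptions read from Table 5-2 (rows s = 4,
# s = 7, and s = 1); Table 5-1 itself is not reproduced here.
ALTERNATIVES = {
    "none": (0.00, 0),
    "m=4": (63.53, 1),
    "m=7": (35.57, 1),
    "m=4,7": (99.10, 2),
}


def net_gain(name, cost):
    """Net gain of an alternative when each acquisition costs `cost`."""
    gain, n_acquisitions = ALTERNATIVES[name]
    return gain - n_acquisitions * cost


def best_alternative(cost):
    """Alternative with the largest net gain at the given cost."""
    return max(ALTERNATIVES, key=lambda name: net_gain(name, cost))


for cost in (10.0, 50.0, 70.0):
    print(cost, best_alternative(cost))
```

Under these assumed gains, the two-acquisition program is preferred while the cost per acquisition is below 99.10 - 63.53 = 35.57, the single acquisition at m = 4 is preferred up to 63.53, and acquisition at m = 7 alone is never preferred.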

In some situations control and planning of such an aperiodic acquisition
program would be difficult. A periodic gathering of information,
while perhaps less than optimal from a strict economic viewpoint, might
be far easier to monitor and implement. We shall designate the spacing of
the periodic replenishment by the parameter "s". As an example, s = 8
signifies acquisition of perfect information at transitions 0, 8, 16, 24,
and so on. We would particularly note that in Example 4 acquisition of
perfect information at m = 5 and m = 8 is redundant and, hence, worthless,
as the decision maker will receive perfect information as a result of
observing the outcome of his decision at these transitions. Figure 5-3
depicts in a qualitative sense the process of periodic information
replenishment.

Table  5-2  summarizes  the  results  of  periodic  programs  applied  to 
Example  4. 

TABLE 5-2

Periodic Replenishments, Example 4

  s     Total Expected    Number of       Notes
        Reward (net)      Acquisitions
 (1)        (2)               (3)          (4)

  8         0.00               0          Equivalent to Base Case
  7        35.57               1
  6        15.34               1
  5         0.00               1          No value for information at 5
  4        63.53               1          No acquisition at 8
  3        58.64               2
  2        78.87               3          No acquisition at 8
  1        99.10               7
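The entries of Table 5-2 can be reproduced with a short calculation. The following is a minimal sketch under stated assumptions: the reward for acting on information of a given age is taken from column (2) of Table 5-3, the base reward at m = 8 is taken as the age-3 value 33.84, information acquired at transition t is usable from t + 1, and an acquisition made at or before an observed decision outcome adds nothing. The names `REWARD_BY_AGE`, `reward`, and `periodic_program` are hypothetical.

```python
# Expected reward when acting on perfect information that is `age` transitions
# old (assumed from Table 5-3); older information gives the prior-only 5.88.
REWARD_BY_AGE = {1: 69.41, 2: 49.18, 3: 33.84, 4: 24.16, 5: 16.29,
                 6: 12.07, 7: 7.86, 8: 5.99}
PRIOR_ONLY = 5.88


def reward(age):
    return REWARD_BY_AGE.get(age, PRIOR_ONLY)


def periodic_program(s, decisions=(5, 8), horizon=8):
    """Gross gain over the base case and acquisition count for spacing s."""
    # Acquisitions at multiples of s strictly inside the horizon.
    acquisitions = list(range(s, horizon, s))
    gain = 0.0
    last_free = None  # transition of the last observed decision outcome
    for d in decisions:
        # Base case: prior only, or the free information from the last outcome.
        base = PRIOR_ONLY if last_free is None else reward(d - last_free)
        usable = [t for t in acquisitions
                  if t < d and (last_free is None or t > last_free)]
        if usable:
            gain += reward(d - max(usable)) - base
        last_free = d
    return round(gain, 2), len(acquisitions)


for s in range(8, 0, -1):
    print(s, *periodic_program(s))
```

The count in column (3) is simply the number of multiples of s among transitions 1 through 7, which is why s = 5 shows one acquisition even though that acquisition has no value.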

Figure 5-4, the parallel to Fig. 5-2, assists in visualizing the
dominant alternatives. Examination of the figure reveals a pattern similar
to that of the previous example.

We now turn to a somewhat more subtle example.

5.2.2  Geometrically Distributed Decision Occurrences

We alter the scenario as follows: Assume the information model, the
alternatives, and the rewards remain unchanged. However, g(m) is now a
geometric distribution. (The decision maker might elect this model for
the decision occurrence because of a feeling that it adequately represents
his state of information. In addition, the distribution has some
"benchmark" properties which will be useful in the further development.)

In particular, let

$$ g(m) = \begin{cases} (0.2)(0.8)^{m-1}, & 1 \le m \le 20 \\ 0, & \text{otherwise} \end{cases} \tag{5.2} $$

The distribution is represented by Fig. 5-5, and we note that the
decision occurs but once.

We limit the replenishment to periodic acquisitions and further
assume that the horizon is an integer multiple of the replenishment period.
In other words, for the example we are developing, acquisition can occur
only at periods of 1, 2, 4, 5, and 10 transitions. One can gather
information fruitfully only through the nineteenth transition, as no
decision will occur after m = 20.

The situation unfolds as follows: The decision maker may elect some
period of information acquisition, say every $\ell$ transitions. He pays a
cost C to acquire information prior to the first transition, which
enables him to set his decision vector for transitions 1, 2, ..., $\ell$. He
can calculate the expected value of this information. At transition $\ell$
he again acquires information if the decision has not occurred by then.
For the geometric distribution in the example, the probability is
$(0.8)^{\ell}$ that the decision has not yet occurred and hence that the
second acquisition will occur. If the decision maker acquires information
the second time, he uses this update to set his decision vector for
transitions $\ell+1, \ell+2, \ldots, 2\ell$. The process is repeated through
the horizon, M.

We illustrate the numerical approach by creation of Table 5-3.
Column (5) of the table is the total expected profit from the information
acquired. The net profit, that is, the total less the expected cost, we
designate by $V(n, s=\ell)$, where n is the number of transitions to go and
$s = \ell$ implies acquisition of information every $\ell$ transitions.

TABLE 5-3

Expected Rewards, Geometric Distribution

If Decision   Reward     Probability       Expected      Cumulative
Occurs at                Decision Occurs   Reward        Reward
m =           v*(m)      g(m)              g(m) v*(m)    sum of g(m) v*(m)
(1)           (2)        (3)               (4)           (5)

 1            69.41      0.200             13.88         13.88
 2            49.18      0.160              7.87         21.75
 3            33.84      0.128              4.33         26.08
 4            24.16      0.102              2.46         28.54
 5            16.29      0.082              1.33         29.87
 6            12.07      0.066              0.79         30.66
 7             7.86      0.052              0.41         31.07
 8             5.99      0.042              0.25         31.32
 9             5.88      0.033              0.20         31.52
10             5.88      0.027              0.16         31.68

We may express $V(20, s=10)$, as an example, as

$$
\begin{aligned}
V(20, s=10) &= -C + \sum_{j=1}^{10} [v^*(j) - \bar v^*(j)]\,\Pr\{\text{decision occurs at } j \mid \mathcal{E}\} \\
&\quad - (0.8)^{10}\,C + \sum_{j=1}^{10} [v^*(j) - \bar v^*(j)]\,\Pr\{\text{decision occurs at } j+10 \mid \mathcal{E}\} \\
&= -C + \sum_{j=1}^{10} [v^*(j) - \bar v^*(j)]\,g(j)
 + G(10)\Big\{-C + \sum_{j=1}^{10} [v^*(j) - \bar v^*(j)]\,g(j)\Big\}
\end{aligned} \tag{5.3}
$$

Here the bracketed difference is the expected reward with information of
age j less the expected reward without it, and $G(10) = (0.8)^{10}$ is the
probability that the decision has not occurred through transition 10.

We may develop similar expressions for s = 1, 2, 4, and 5. These
are plotted in Fig. 5-6.
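Columns (4) and (5) of Table 5-3 follow mechanically from columns (2) and (3). The block below is a minimal sketch of that calculation, assuming the reward values as printed in column (2) and the unrounded geometric probabilities of (5.2); the names `REWARD`, `expected`, and `cumulative` are illustrative.

```python
from itertools import accumulate

F = 0.8  # probability that the decision does not occur on a given transition
# Column (2) of Table 5-3: expected reward if the decision occurs at m.
REWARD = [69.41, 49.18, 33.84, 24.16, 16.29, 12.07, 7.86, 5.99, 5.88, 5.88]


def g(m):
    """Geometric occurrence probability of (5.2)."""
    return (1 - F) * F ** (m - 1)


# Column (4): probability-weighted reward; column (5): its running total.
expected = [g(m) * v for m, v in enumerate(REWARD, start=1)]
cumulative = list(accumulate(expected))

for m, (e, c) in enumerate(zip(expected, cumulative), start=1):
    print(f"{m:2d}  {g(m):.3f}  {e:6.2f}  {c:6.2f}")
```

With unrounded probabilities the running total stays within a few hundredths of the tabulated column (5); the small differences are consistent with the table having been built from rounded entries.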

5.3  Rules of Optimality

We shall now develop some rules of optimality with proof of each.
The following statement of the problem applies to all rules. The horizon
is M transitions. The periodic acquisition of information occurs at
intervals k, $\ell$, etc., such that $Kk = L\ell = M$. The cost of one
acquisition is C, and the cost is linear, i.e., two acquisitions imply 2C.
The decision occurs but once and is governed by

$$ g(m) = f^{m-1}(1-f), \qquad 1 \le m \le M \tag{5.4} $$


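Rule 1

$$ V(M, s=\ell) = \frac{1 - f^{M}}{1 - f^{\ell}}\,[\sigma(\ell) - C] \tag{5.5} $$

where

$$ \sigma(\ell) = \sum_{j=1}^{\ell} [v^*(j) - \bar v^*(j)]\,g(j) \tag{5.6} $$

Proof

The expression follows from recognition of three results. The
first is that

$$ \sum_{k=0}^{L-1} f^{k\ell} = \frac{1 - f^{L\ell}}{1 - f^{\ell}} $$

The second immediately follows as $L\ell = M$ by assumption. The third
result comes from the "memoryless" property of the geometric
distribution. Assume, for example, the period of acquisition is $\ell$, and
we are interested in the k-th acquisition. The expected value is then

$$ -G(k\ell)\,C + \sum_{j=1}^{\ell} [v^*(j) - \bar v^*(j)]\,g(k\ell + j) \tag{5.7} $$

where $G(k\ell) = f^{k\ell}$ is the probability that the decision has not
occurred through transition $k\ell$. However, we may also write (5.7) as

$$ -G(k\ell)\,C + G(k\ell)\Big\{\sum_{j=1}^{\ell} [v^*(j) - \bar v^*(j)]\,g(j)\Big\} \tag{5.8} $$

This expression holds for k = 0, 1, 2, ... .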
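The closed form of Rule 1 can be checked against a direct summation of (5.7) over the successive acquisitions. The sketch below assumes the example data (f = 0.8, M = 20), the reward values of Table 5-3, and a constant no-information reward of 5.88; the function names are hypothetical.

```python
F, M = 0.8, 20
NO_INFO = 5.88  # assumed reward without fresh information (prior only)
REWARD = [69.41, 49.18, 33.84, 24.16, 16.29, 12.07, 7.86, 5.99, 5.88, 5.88]


def g(m):
    """Geometric occurrence probability, g(m) = f^(m-1) (1 - f)."""
    return (1 - F) * F ** (m - 1)


def sigma(period):
    """sigma(l) of (5.6): expected gain from one acquisition."""
    return sum((REWARD[j - 1] - NO_INFO) * g(j) for j in range(1, period + 1))


def value_direct(period, cost):
    """Sum (5.7) over the acquisitions k = 0, 1, ..., M/period - 1."""
    total = 0.0
    for k in range(M // period):
        survive = F ** (k * period)  # decision has not yet occurred
        gain = sum((REWARD[j - 1] - NO_INFO) * g(k * period + j)
                   for j in range(1, period + 1))
        total += -survive * cost + gain
    return total


def value_closed(period, cost):
    """Closed form: (1 - f^M) / (1 - f^l) * (sigma(l) - C)."""
    return (1 - F ** M) / (1 - F ** period) * (sigma(period) - cost)


for period in (1, 2, 4, 5, 10):
    print(period, value_direct(period, 10.0), value_closed(period, 10.0))
```

The two columns agree to floating-point precision because g(kl + j) = f^(kl) g(j), which is the memoryless property used in the proof.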

Rule 2

Let $V(M, s=\ell^*)$ represent the optimal acquisition policy for some
value of C. The graph (see Fig. 5-6) is piecewise linear in C.

Proof

Immediate.

Rule 3

No acquisition policy is dominant for all values of C.

Proof

For dominance some policy, say acquisition every $\ell$ transitions,
must yield a value of V that intersects the reward axis at a value
greater than any other policy, i.e., if C = 0,

$$ V(M, s=\ell) \ge V(M, s=k), \qquad k \ne \ell $$

Also the policy must intersect the C axis at a value greater than
any other policy (see Fig. 5-7). For C = 0, V is obviously maximized
for $\ell = 1$ and decreases in $\ell$. On the cost axis the intersection
occurs at $\sigma(\ell) - C = 0$. By the definition of (5.6), we know that
$\sigma(\ell)$ is an increasing function of $\ell$. Therefore, no policy is
dominant.

Rule 4

Assume $\sigma(\ell) - C \ge 0$ for some $\ell$. Then the optimal policy
$\ell^*$ is determined by

$$ \frac{\sigma(\ell^*) - C}{1 - f^{\ell^*}} \ge \frac{\sigma(\ell) - C}{1 - f^{\ell}}, \qquad \ell \ne \ell^* \tag{5.9} $$

If $\sigma(\ell) - C \le 0$ for all $\ell$, then $\ell^* = \infty$ (no acquisition).

Proof

This follows directly from Rule 1.

Rule 4 furnishes a fairly tractable determination of the optimal
policy for some particular C. For instance, in the example we might
let C = 10. Then we could construct Table 5-4, confirming what was
graphically depicted in Fig. 5-6.


TABLE 5-4

Optimal Policy for the Example Problem (f = 0.8)

  $\ell$     $\sigma(\ell)$     $[\sigma(\ell) - C] / (1 - f^{\ell})$

   1          12.70              13.50
   2          19.63              26.75   Optimal
   4          25.09              25.56
   5          25.95              23.72
  10          26.47              18.45
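Table 5-4 can be regenerated from Rule 4. The block below is a minimal sketch, assuming C = 10, the reward values of Table 5-3, and a constant no-information reward of 5.88; `criterion` and `best` are hypothetical names.

```python
F, COST = 0.8, 10.0
NO_INFO = 5.88  # assumed reward without fresh information
REWARD = [69.41, 49.18, 33.84, 24.16, 16.29, 12.07, 7.86, 5.99, 5.88, 5.88]
PERIODS = (1, 2, 4, 5, 10)  # periods that divide the horizon M = 20


def sigma(period):
    """sigma(l) of (5.6) for the geometric occurrence distribution."""
    return sum((REWARD[j - 1] - NO_INFO) * (1 - F) * F ** (j - 1)
               for j in range(1, period + 1))


def criterion(period, cost=COST):
    """Rule 4 criterion (sigma(l) - C) / (1 - f^l); larger is better."""
    return (sigma(period) - cost) / (1 - F ** period)


best = max(PERIODS, key=criterion)
for period in PERIODS:
    flag = "  Optimal" if period == best else ""
    print(f"{period:2d}  {sigma(period):6.2f}  {criterion(period):6.2f}{flag}")
```

The computed entries differ from the tabulated ones by a few hundredths at most, and the ranking, with a period of 2 as the optimum, is unchanged.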

Rule 5

Assume the value of the information perishes by the m-th transition
counting forward; i.e., $v^*(j) - \bar v^*(j) = 0$, $j \ge m$. Then
$\ell^* \le m$ for all C.

Proof

Rule 4 requires

$$ \frac{\sigma(\ell^*) - C}{1 - f^{\ell^*}} \ge \frac{\sigma(m) - C}{1 - f^{m}}, \qquad m \ne \ell^* \tag{5.10} $$

Assume $\ell^* > m$ in contradiction to Rule 5. By the statement of the
problem $\sigma(\ell^*) = \sigma(m)$. Therefore, dividing by
$\sigma(m) - C > 0$, inequality (5.10) reduces to

$$ \frac{1}{1 - f^{\ell^*}} \ge \frac{1}{1 - f^{m}} \tag{5.11} $$

or

$$ f^{\ell^*} \ge f^{m} \tag{5.12} $$

which is false for $\ell^* > m$. Therefore, $\ell^*$ must be no greater
than m, which completes the proof.

These rules will be useful in analyzing other distributions.

5.4  Other  Markovian  Distributions 

5.4.1  Increasing (Decreasing) Decision Occurrence Rates

We adapt a concept from Wagner [23] to define an increasing (decreasing)
decision occurrence rate. We shall use a special breed of Markov
chain where one state, which we shall designate $j^*$, is the state
"Decision occurs," and the other states, i = 0, 1, ..., are directly
associated with transitions that have no decision occurrence. We let

$$ r(i) = p_{i,j^*} \tag{5.13} $$

and define an increasing (decreasing) occurrence rate if r(i) is
increasing (decreasing) in i. Roughly, this translates as the decision
maker feeling that, as time passes and given that the decision has not
occurred, the probability of the decision occurring increases (decreases).
The distributions of Figs. 5-8a and 5-8b depict increasing and decreasing
occurrence rates, respectively. Figure 5-9 is a graph of V for the
three cases: increasing, constant, and decreasing occurrence rates.
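The occurrence rate can be computed from any occurrence distribution as the probability of the decision on the next transition given that it has not yet occurred. The block below is a minimal sketch of that calculation and of the resulting classification; the function names and the example distributions are illustrative and are not taken from Figs. 5-8a and 5-8b.

```python
def occurrence_rate(pmf):
    """r(i) = Pr{decision on transition i+1 | no decision in the first i}."""
    rates, survival = [], 1.0
    for p in pmf:
        if survival <= 1e-12:  # nothing left to condition on
            break
        rates.append(p / survival)
        survival -= p
    return rates


def classify(pmf, tol=1e-9):
    """Label the occurrence rate as constant, increasing, or decreasing."""
    r = occurrence_rate(pmf)
    diffs = [b - a for a, b in zip(r, r[1:])]
    if all(abs(d) <= tol for d in diffs):
        return "constant"
    if all(d >= -tol for d in diffs):
        return "increasing"
    if all(d <= tol for d in diffs):
        return "decreasing"
    return "neither"


geometric = [0.2 * 0.8 ** m for m in range(10)]  # rate 0.2 throughout
uniform = [0.2] * 5                              # rate 0.2, 0.25, 0.33, 0.5, 1
front_loaded = [0.5, 0.2, 0.09, 0.042]           # rate 0.5, 0.4, 0.3, 0.2
print(classify(geometric), classify(uniform), classify(front_loaded))
```

The geometric distribution is the constant-rate case, which is why it serves as the benchmark between the increasing and decreasing cases.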


5.4.2  Optimal Acquisition Policies

We note in Fig. 5-9 that for the increasing (decreasing) occurrence
rate the optimal acquisition period for a given C is less than or
equal to (greater than or equal to) the period for the "constant"
occurrence rate. We now show this is a general result.

Theorem 5.1

Let $\{p_{ij}\}$ define a Markov chain as defined in the previous
paragraph. If $p_{i,j^*}$ is increasing (decreasing) in i, then for a
particular cost of information acquisition C, the optimal periodic
acquisition occurs at a period s where s is less than or equal to
(greater than or equal to) that s determined by Rule 4.


Proof 

The proof proceeds by induction on s, the acquisition period.
The proof will be for the increasing case, as the decreasing case is
then obvious. The proof is based on determining the value of C
for which the decision maker is indifferent between acquisition at
each transition and at every second transition. We then show for
the increasing case that this value is larger than that for the
constant occurrence case. This implies that the optimal acquisition
period is less. The intersection for the "constant" case is
determined from

$$ V(M, s=1) = V(M, s=2) \tag{5.14} $$

Using (5.3) we express the equality as

$$
\begin{aligned}
&-C\,[G(0) + G(1) + \cdots + G(M-1)] + [g(1) + g(2) + \cdots + g(M)]\,[v^*(1) - \bar v^*(1)] \\
&\quad = -C\,[G(0) + G(2) + \cdots + G(M-2)] + [g(1) + g(3) + \cdots + g(M-1)]\,[v^*(1) - \bar v^*(1)] \\
&\qquad + [g(2) + g(4) + \cdots + g(M)]\,[v^*(2) - \bar v^*(2)]
\end{aligned} \tag{5.15}
$$

Let the equating value of C in (5.15) be $C_o(1)$. Solving for this
quantity, with the no-information rewards $\bar v^*(1)$ and $\bar v^*(2)$
taken to be equal so that they cancel, yields

$$ C_o(1) = \frac{g(2) + g(4) + \cdots + g(M)}{G(1) + G(3) + \cdots + G(M-1)}\,[v^*(1) - v^*(2)] \tag{5.16} $$

The numerator of (5.16) is equivalently expressed as

$$ G(1)\,p_{1,j^*} + G(3)\,p_{3,j^*} + \cdots + G(M-1)\,p_{M-1,j^*} \tag{5.17} $$

Thus, for (5.16) we may write

$$ C_o(1) = \frac{G(1)\,p_{1,j^*} + G(3)\,p_{3,j^*} + \cdots + G(M-1)\,p_{M-1,j^*}}{G(1) + G(3) + \cdots + G(M-1)}\,[v^*(1) - v^*(2)] \tag{5.18} $$

Let $C_o'(1)$ represent the equating value for a Markov chain with an
increasing decision occurrence rate, with primes denoting the
corresponding quantities for that chain. Then

$$ C_o'(1) = \frac{G'(1)\,p'_{1,j^*} + G'(3)\,p'_{3,j^*} + \cdots + G'(M-1)\,p'_{M-1,j^*}}{G'(1) + G'(3) + \cdots + G'(M-1)}\,[v^*(1) - v^*(2)] \tag{5.19} $$

One may then compare the right-hand sides of (5.18) and (5.19).
Since $p'_{i,j^*} \ge p_{i,j^*}$ for all values of $i \le M$, the conclusion is

$$ C_o'(1) \ge C_o(1) \tag{5.20} $$

This situation is depicted in Fig. 5-10. The implication is that the
decision maker would adhere to an "s = 1" policy at a greater cost for
the increasing occurrence rate situation. Thus, the optimal acquisition
policy for the increasing occurrence rate is s = 1 while the
optimal policy for the constant rate is s = 1 or s = 2 (or perhaps
even higher). Therefore, we have proven the assertion for s = 1.

Assume the results hold for s = k. We shall now prove the theorem
for s = k - 1, the intersection of the k - 1 and k policies.
We can express $V(M, s=k)$ as

$$ V(M, s=k) = \sum_{i=0}^{K-1} \sum_{\ell=1}^{k} g(ik+\ell)\,[v^*(\ell) - \bar v^*(\ell)] - \sum_{i=0}^{K-1} G(ik)\,C \tag{5.21} $$

Similarly, we write

$$ V(M, s=k-1) = \sum_{i=0}^{K} \sum_{\ell=1}^{k-1} g[i(k-1)+\ell]\,[v^*(\ell) - \bar v^*(\ell)] - \sum_{i=0}^{K} G[i(k-1)]\,C \tag{5.22} $$

At the intersection the two expressions are equal, or

$$
\begin{aligned}
&\sum_{i=0}^{K-1} \sum_{\ell=1}^{k} g(ik+\ell)\,[v^*(\ell) - \bar v^*(\ell)] - \sum_{i=0}^{K-1} G(ik)\,C \\
&\quad = \sum_{i=0}^{K} \sum_{\ell=1}^{k-1} g[i(k-1)+\ell]\,[v^*(\ell) - \bar v^*(\ell)] - \sum_{i=0}^{K} G[i(k-1)]\,C
\end{aligned} \tag{5.23}
$$

The assertion of the theorem is that $C_o(k-1)$ as determined by the
solution of (5.5) is less than $C_o'(k-1)$ as determined by (5.23).
If this is true, then one may substitute $C_o(k-1)$ into (5.23), and
an equality would no longer exist. Instead the left-hand side would
be less than the right-hand side. We may show that

$$ C_o(k-1) = \sigma(k-1) - (1 - f^{k-1})\,[v^*(k) - \bar v^*(k)] \tag{5.24} $$

where $\sigma(k-1)$ was defined by (5.6). We also recognize that

$$ \sigma(k-1) = \sum_{j=1}^{k-1} (1-f)\,f^{j-1}\,[v^*(j) - \bar v^*(j)] \tag{5.25} $$

Thus, $C_o(k-1)$ is a function of k terms involving $v^*(j)$ and
$\bar v^*(j)$, j = 1, 2, ..., k. We can represent this as

$$ C_o(k-1) = \sum_{j} c(j)\,[v^*(j) - \bar v^*(j)] = \sum_{j} v'(j) \tag{5.26} $$

Substitution of the expression for $C_o(k-1)$ into (5.23) yields

$$ \sum_{i=0}^{K-1} \sum_{\ell=1}^{k} g(ik+\ell)\,[v^*(\ell) - \bar v^*(\ell) - v'(\ell)] < \sum_{i=0}^{K} \sum_{\ell=1}^{k-1} g[i(k-1)+\ell]\,[v^*(\ell) - \bar v^*(\ell) - v'(\ell)] \tag{5.27} $$

This completes the proof of the theorem for the increasing rate case.
The opposite conclusions hold for the decreasing rate case.
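Equation (5.24) for the constant-rate case can be checked numerically: at the cost it gives, the Rule 4 criteria for periods k - 1 and k should coincide. The block below is a minimal sketch of that check, assuming f = 0.8, the reward values of Table 5-3, and a constant no-information reward of 5.88; the function names are hypothetical.

```python
F = 0.8
NO_INFO = 5.88  # assumed reward without fresh information
REWARD = [69.41, 49.18, 33.84, 24.16, 16.29, 12.07, 7.86, 5.99, 5.88, 5.88]


def gain(j):
    """Reward with information of age j less the no-information reward."""
    return REWARD[j - 1] - NO_INFO


def sigma(period):
    """sigma(l) as written in (5.25)."""
    return sum((1 - F) * F ** (j - 1) * gain(j) for j in range(1, period + 1))


def indifference_cost(k):
    """C_o(k-1) of (5.24): periods k-1 and k are equally good at this cost."""
    return sigma(k - 1) - (1 - F ** (k - 1)) * gain(k)


def criterion(period, cost):
    """Rule 4 criterion (sigma(l) - C) / (1 - f^l)."""
    return (sigma(period) - cost) / (1 - F ** period)


for k in range(2, 11):
    c = indifference_cost(k)
    print(k - 1, k, round(c, 3),
          round(criterion(k - 1, c), 3), round(criterion(k, c), 3))
```

For k = 2 the expression reduces to (1 - f)[v*(1) - v*(2)], about 4.05 for these data, which is the cost below which acquisition at every transition is preferred to acquisition at every second transition.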

We have seen in this section the utility of the geometric distribution
as a benchmark for any singly occurring decision. The next section
extends this property to multiple occurring decisions.


5.5  Repetitive  Decision  Situations 


5.5.1  The  Model 


To this point we have examined decisions that occur but once. However,
the process may be repetitive. The decision occurrence is described
by a probability mechanism. The true state of nature is revealed to the
decision maker after each occurrence, and the process continues to the
horizon. We could model the process as shown in Fig. 5-11.

We may also develop an equivalent Information Value Model as shown
in Fig. 5-12.

States 1 through 9 of Fig. 5-12 represent the values of perfect
information after one through nine transitions. These values, as initially
presented in Table 2-2, are 69.41, 49.17, and so on down to 5.88 at state 9.
We also note that state 1 may be entered only through information
acquisition. Without the deliberate acquisition the value of information
perishes through a minimum of two transitions between decision
occurrences.

We offer some rationale for the model. For the deliberate acquisition
situation we hypothesize that the decision maker anticipates receipt
of the information, and he has his resources mobilized to act on
the information after a delay of one period. However, information gained
as a result of observing the true state of nature after a decision
occurrence has an element of surprise, and two periods are required for
reaction.


The sequence of the action is also pertinent. The decision maker
first elects to acquire or not to acquire information. Following this he
finds out if the decision occurs or not on that particular transition.

5.5.2  The Value of Information for a Specific Example

Let f = 0.8 and the horizon M = 20 in the model of Fig. 5-11.
We may determine the expected value of a no-information base case. The
analysis proceeds by either using "special" transitions (see Howard
[2]) or by expected state occupancies. The calculation of the occupancies
of state 2, as an example, is simplified by collapsing the remaining
states as shown in Fig. 5-13.

Let $\nu_{92}(n)$ be the expected occupancies of state 2 conditioned on
starting in state 9 and n transitions having occurred. Then

[Equation (5.28) is illegible in the source.]

From the model we realize that only 0.2 of the occupancies of state
2 result in decision occurrences and the accruing of the 49.17 reward.
Also, if the horizon is M, only the occupancies in M - 1 transitions
are of interest, as one transition is required to go from the information
state to the decision occurrence state. We consider states three through
nine in a similar manner and arrive at a base case value of 73.22,
substantially greater than the base 5.88 for the singly occurring decision.

The process of information acquisition is equivalent to starting
the model in state 1. For example, if s = 1, then the expected value
is (20 occupancies of state 1)(0.2)(69.41) = 277.64. Thus, using the
notation of Section 5.2 we have

$$ V(20, s=1) = (20)(0.2)(69.41) - 73.22 - 20C = 204.41 - 20C \tag{5.29} $$

We develop similar expressions for s = 5, 4, and 2. The graphs
of each are plotted in Fig. 5-14 along with comparable graphs from the
singly occurring case.

5.5.3  General Rules of Optimality

We shall now develop rules of optimality to parallel those presented
in Section 5.3 for the singly occurring decision. The following statement
of the problem applies: The horizon is M transitions. The
periodic acquisition of information occurs at intervals k, $\ell$, etc.,
such that $Kk = L\ell = M$. The cost of one acquisition is C, and the
cost is linear as previously defined. The probability of the decision
occurring, based on the model of Fig. 5-11, is

g(m) = [expression illegible in the source]   (5.30)


Rule 1

The expected state occupancy of state j in m transitions conditioned
on starting in state 0, the no information state, is given by

[Equation (5.31) is illegible in the source.]

Proof

The proof follows from Markov state occupancy mechanics.

Rule 2

The expected state occupancy of state j in n transitions conditioned
on starting in state 1, as shown in Fig. 5-13, denoted
$\bar\nu_{1j}(n)$, is

[Equation (5.32) is largely illegible in the source; the legible part
gives an occupancy of 1 for j = 1.]

Proof

The proof again follows from Markov mechanics.

Rule 3

The net expected reward conditioned on acquiring information at
every $\ell$-th transition is

$$ V(M, s=\ell) = \frac{M}{\ell}\Big\{\sum_{j=1}^{\ell} \bar\nu_{1j}(\ell-1)\,(1-f)\,[v^*(j)] - C\Big\} \tag{5.33} $$

Proof

The result is derived from Rules 1 and 2 and the definitions of
the states used in the model.

Rule 4

Let $V(M, s=\ell^*)$ represent the optimal acquisition policy for some
value of C. The graph (see Fig. 5-14) is piecewise linear in C.

Proof

Immediate.


There appears to be no readily tractable method for determining the
value of $\ell^*$, the optimal acquisition period, for some particular value
of C. However, we may show that it is no greater than some value m, where
m is the transition, counting forward, where the information has perished.

Rule 5

Assume the value of information perishes by the m-th transition
counting forward; i.e., $v^*(j) - \bar v^*(j) = 0$, $j \ge m$. Then
$\ell^* \le m$ for all C.

Proof

Assume the existence of two optimal policies $k^*$ and $\ell^*$ such that
$V(M, s=k^*) \ge V(M, s=j)$ for all $j \ge m$, and
$V(M, s=\ell^*) \ge V(M, s=j)$ for all $j \le m$. In particular, by
assumption, $V(M, s=k^*) \ge V(M, s=m)$. This implies that

$$ \frac{1}{k^*}\Big\{\sum_{j=0}^{k^*} [v^*(j) - \bar v^*(j)]\Big\} \ge \frac{1}{m}\Big\{\sum_{j=0}^{m} [v^*(j) - \bar v^*(j)]\Big\} \tag{5.34} $$

However, by the concept of information perishing the bracketed
summations are equal, which implies

$$ \frac{1}{k^*} \ge \frac{1}{m} \tag{5.35} $$

which is false for $k^* > m$. Therefore, a policy $\ell^*$ less than or
equal to m must be the optimal policy.

Rule 6

No acquisition policy is dominant for all values of C.

Proof

Refer to Fig. 5-14. The intercept on the reward axis obviously
decreases with increasing $\ell$. However, the intercept on the cost
axis increases with increasing $\ell$. Thus, no policy dominates.

Rule 7

Let C(1) be the cost for which the decision maker is indifferent
between s = 1 and s = 2. Then

$$ C(1) = (1-f)\,v^*(1) - f(1-f)\,v^*(2) \tag{5.36} $$

Proof

Let B = expected reward for the no information case. We can write
the equating value for the s = 1 and the s = 2 case using (5.33)
as

$$ M\,[(1-f)\,v^*(1) - C(1)] - B = \frac{M}{2}\,[(1-f)\,v^*(1) + f(1-f)\,v^*(2) - C(1)] - B \tag{5.37} $$

The result follows from solution of (5.37). This value will be
useful in comparison with the singly occurring case.
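The indifference cost of Rule 7 can be compared numerically with its singly occurring counterpart, (1 - f)[v*(1) - v*(2)], which is (5.24) with k = 2 when the no-information reward is the same at both ages. The block below is a minimal sketch using f = 0.8 and the values 69.41 and 49.17 quoted for states 1 and 2; the function names are hypothetical.

```python
def multiple_indifference_cost(f, v1, v2):
    """C(1) of (5.36) for the repetitive-decision case."""
    return (1 - f) * v1 - f * (1 - f) * v2


def single_indifference_cost(f, v1, v2):
    """Constant-rate, singly occurring counterpart: (1 - f)(v1 - v2)."""
    return (1 - f) * (v1 - v2)


f, v1, v2 = 0.8, 69.41, 49.17
c_multi = multiple_indifference_cost(f, v1, v2)
c_single = single_indifference_cost(f, v1, v2)
print(round(c_multi, 3), round(c_single, 3), round(c_multi - c_single, 3))
```

The difference between the two costs is (1 - f)^2 v*(2), which is non-negative, so for these data the decision maker in the repetitive case tolerates a higher cost (about 6.01 against 4.05) before abandoning acquisition at every transition.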

We had earlier noted the use of the "constant" occurrence case to
establish a bound on the optimal acquisition period. This case also
serves as a benchmark for the multiple occurring decision, although the
results are surprisingly different.

Theorem 5.2

Let $C_o(k)$ be the cost of information acquisition for the constant
rate, singly occurring decision case for which the decision maker
is indifferent between s = k and s = k + 1. Let C(k) be similarly
defined for the multiple occurring case. Then $C(1) \ge C_o(1)$
but $C(k) \le C_o(k)$, k > b > 1.

Proof

By Rule 7

$$ C(1) = (1-f)\,v^*(1) - f(1-f)\,v^*(2) \tag{5.38} $$

From (5.5) one may show that

$$ C_o(1) = (1-f)\,[v^*(1) - v^*(2)] \tag{5.39} $$

Comparison of (5.38) and (5.39) proves the assertion regarding C(1)
and $C_o(1)$. The remainder of the proof parallels the proof of
Theorem 5.1. For the multiple occurring case we determine C(k)
by setting

$$ V(M, s=k) = V(M, s=k+1) \tag{5.40} $$

Then (5.32) may be used to evaluate $V(M, s=k)$ and $V(M, s=k+1)$
to yield

$$
\begin{aligned}
&\frac{M}{k}\Big[(1-f)\,v^*(1) + \sum_{j=2}^{k} \bar\nu_{1j}(k-1)\,(1-f)\,v^*(j) - C(k)\Big] \\
&\quad = \frac{M}{k+1}\Big[(1-f)\,v^*(1) + \sum_{j=2}^{k+1} \bar\nu_{1j}(k)\,(1-f)\,v^*(j) - C(k)\Big]
\end{aligned} \tag{5.41}
$$

where $\bar\nu_{1j}(n)$ is the expected occupancy of state j in n
transitions starting from state 1 (Rule 2). We determine $C_o(k)$ by use
of (5.24). The theorem requires that the substitution of (5.24) into
(5.41) destroys the equality and results in an inequality with the
left-hand side, that is, the side with the greatest number of
acquisitions, being the lesser side. Substitution of (5.24) into (5.41)
confirms this result and proves the theorem.

These results are obviously not as "clean" as those of the singly
occurring decision, as there is some uncertainty where the cross-over in
the bound occurs. However, the ease of determining the values for the
constant occurrence, singly occurring case recommends its use for these
rough estimates.

5.6  The  Case  for  Periodic  Replenishment 

We have suggested that the optimal policy in some instances is
strictly periodic acquisition of information. An approach following a
development of Barlow and Proschan [9] rigorously supports this
contention (under certain limiting conditions).

Suppose that the time between deliberate information acquisitions
is described by some distribution function F(x) (and density function
f(x)). This generates a series of information acquisitions; the time
between each acquisition is a random variable, X (see Fig. 5-15a). The
X's are identically and independently distributed random variables which
we shall designate $\{X_k\}$.

Similarly, let the time between decision occurrences, Y, be generated
by G(y). The set of Y's, designated by $\{Y_k\}$, are also
identically and independently distributed (Fig. 5-15a).

Further, define a third set of random variables, $\{Z_k\}$, with
$Z_k = \min(X_k, Y_k)$ (see Fig. 5-15b). The Z's delineate a series of
information replenishments, some of which are by design and some of which
occur "free" in that they result from observation of the outcome of a
decision. Let

$$ N_1(m) = \text{number of acquisitions by design by transition } m \tag{5.42} $$

$$ N_2(m) = \text{number of acquisitions by decision occurrence by transition } m \tag{5.43} $$

$$ N(m) = N_1(m) + N_2(m) = \text{total acquisitions by transition } m \tag{5.44} $$

From the definition of Z,

$$ \Pr\{Z \le k\} = 1 - \bar G(k)\,\bar F(k) \tag{5.45} $$

where $\bar F = 1 - F$ and $\bar G = 1 - G$, and one may show that

$$ \langle Z \mid \mathcal{E} \rangle = \sum_{k} \bar G(k)\,\bar F(k) \tag{5.46} $$

Therefore,

$$ \lim_{m \to \infty} \frac{N(m)}{m} = \frac{1}{\sum_{m} \bar G(m)\,\bar F(m)} \tag{5.47} $$

Two indicator random variables will be useful in the development.

$$ V_k = \begin{cases} 1 & \text{if } Z_k = X_k \text{ (replenishment by design)} \\ 0 & \text{otherwise} \end{cases} \tag{5.48} $$

$$ W_k = \begin{cases} 1 & \text{if } Z_k = Y_k \text{ (replenishment by decision outcome observation)} \\ 0 & \text{otherwise} \end{cases} \tag{5.49} $$

Reflection shows that

$$ \langle V \mid \mathcal{E} \rangle = \Pr\{X \le Y\} = \sum_{m} F(m)\,g(m) \tag{5.50} $$

$$ \langle W \mid \mathcal{E} \rangle = \Pr\{Y < X\} = \sum_{m} G(m)\,f(m) \tag{5.51} $$


These indicator random variables "identify" the method of information
replenishment.

We can describe the average reward per transition for an infinite
horizon process by two terms: the first is the reward accruing to the
decision maker if the decision occurs, and the second is the cost of
deliberate information replenishment.

The cost is a constant which we shall label $C_1$. We recollect that
we receive perfect information from both the deliberate acquisition and
from observing the system as the decision occurs. Let the value of this
perfect information be $C^*$ and the reward at k transitions later be
$C'(k)$. Let

$$ C_2(k) = C'(k) - C^* \tag{5.52} $$

so that $C_2(k)$ represents a "cost" of not having perfect information.
The average cost per transition is

$$ AC = \lim_{m \to \infty} \left[ \frac{C_1 \langle N_1(m) \mid \mathcal{E} \rangle}{m} + \frac{\langle C_2(m)\,N_2(m) \mid \mathcal{E} \rangle}{m} \right] \tag{5.53} $$

$$ AC = \frac{C_1 \sum_{m} F(m)\,g(m)}{\sum_{m} \bar G(m)\,\bar F(m)} + \frac{\sum_{m} C_2(m)\,G(m)\,f(m)}{\sum_{m} \bar G(m)\,\bar F(m)} \tag{5.54} $$

We may rewrite the numerator of the first term as

$$ C_1 \sum_{m} F(m)\,g(m) = C_1 \sum_{m} \bar G(m)\,f(m) \tag{5.55} $$

The denominator of both terms is $\langle Z \mid \mathcal{E} \rangle$. From
basic considerations

$$ \langle Z \mid \mathcal{E} \rangle = \sum_{x=0}^{\infty} \Big[ \sum_{y=0}^{x} y\,g(y) + x \sum_{y=x}^{\infty} g(y) \Big] f(x) \tag{5.56} $$

This enables one to rewrite (5.54) as

$$ AC[F(x)] = \frac{\sum_{x=0}^{\infty} [\,C_1 \bar G(x) + C_2(x)\,G(x)\,]\,f(x)}{\sum_{x=0}^{\infty} \Big[ \sum_{y=0}^{x} y\,g(y) + x \sum_{y=x}^{\infty} g(y) \Big] f(x)} \tag{5.57} $$

$$ AC[F(x)] = \frac{\sum_{x=0}^{\infty} [R(x)]\,f(x)}{\sum_{x=0}^{\infty} [S(x)]\,f(x)} \tag{5.58} $$

Since x varies from 0 to $\infty$, and assuming neither $C_1$ nor
$C_2(x)$ is infinite, there is some $x_0$, perhaps infinite (implying no
replenishment by design), that minimizes the bracketed quotient of (5.58),
i.e., such that

$$ \frac{R(x_0)}{S(x_0)} \le \frac{R(x)}{S(x)}, \qquad 0 < x < \infty \tag{5.59} $$

Multiplying through by $S(x)\,f(x)$ and summing over x gives

$$ \frac{\sum_{x} R(x_0)\,f(x)}{\sum_{x} S(x_0)\,f(x)} \le \frac{\sum_{x} R(x)\,f(x)}{\sum_{x} S(x)\,f(x)} \tag{5.60} $$

where the left-hand side equals $R(x_0)/S(x_0)$, the average cost under
the distribution that places all probability on $x_0$, or

$$ AC[F(x_0)] \le AC[F(x)] \tag{5.61} $$

In summary:

Theorem 5.3

For the infinite horizon, multi-decision case where the objective
function is minimization of the average cost per transition, the
optimal policy is strictly periodic replenishment of information.

This, of course, is not an unexpected result for an infinite horizon
case. However, we have previously shown in Example 4 that an aperiodic
policy may be optimal for a finite horizon case.
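The step from (5.59) to (5.61) rests on the fact that a ratio of expectations can never fall below the smallest pointwise ratio when the denominators are positive. The block below is a minimal numerical sketch of that fact; the arrays `R` and `S` are hypothetical stand-ins for R(x) and S(x) over a finite range of x, not values derived from the example.

```python
import random


def average_cost(weights, numer, denom):
    """Ratio of expectations, as in (5.58), for a distribution `weights`."""
    top = sum(w * r for w, r in zip(weights, numer))
    bottom = sum(w * s for w, s in zip(weights, denom))
    return top / bottom


def best_period(numer, denom):
    """Index minimizing R(x) / S(x): the strictly periodic policy."""
    return min(range(len(numer)), key=lambda x: numer[x] / denom[x])


rng = random.Random(0)
R = [rng.uniform(1.0, 5.0) for _ in range(8)]  # hypothetical R(x) > 0
S = [rng.uniform(1.0, 5.0) for _ in range(8)]  # hypothetical S(x) > 0
x0 = best_period(R, S)
floor = R[x0] / S[x0]

# Try many randomized replenishment distributions and keep the cheapest.
worst_case = float("inf")
for _ in range(1000):
    raw = [rng.random() for _ in R]
    total = sum(raw)
    weights = [w / total for w in raw]
    worst_case = min(worst_case, average_cost(weights, R, S))

print(floor, worst_case)
```

No randomized spacing does better than always replenishing at the single best spacing, which is the content of Theorem 5.3 under the stated assumptions.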

5.7  Summary

This chapter has illustrated the concept of information replenishment
and optimal policies of replenishment both for the singly occurring
decision and multiple decisions. The use of the geometric distribution
as a benchmark was discussed, and the final theorem rigorously
demonstrated that a periodic acquisition policy was optimal under certain
limiting conditions.

CHAPTER 6

RELIABILITY AND MAINTAINABILITY THEORY

6.1  Introduction

In Chapter 5 we were able to capitalize on extant reliability theory
for proof of Theorem 5.3. Reliability theory treats the failure of a
piece of equipment, i.e., its deterioration from a superior to an
inferior state, and the maintainability of equipment, i.e., optimal
strategies to restore the equipment to a superior state. This study has
treated the deterioration and restoration of information. There is a
readily apparent analogy that lends support to the greater utilization
of the well-established reliability theory. This chapter investigates
several possibilities.

6.2  Definitions 

Several  definitions  from  the  theory  will  be  useful. 

1.  Reliability.  The  classical  definition  is  the  probability 

of  a device  performing  its  purpose  adequately  for  the  period 
of  time  Intended  under  the  operating  conditions  encountered  [Ij. 

2.  Failure.  The  complement  of  reliability. 

3.  Failure  distribution.  [F(t)]  . The  distribution  function 
that  describes  the  failure  of  an  item  of  interest,  i.e.. 


I 


e 


F(t}  •>  i%’{item  has  failad  or  is  in  a failed  state  at 
time  t|c} 

4.  Failure  rata  function,  r(t)  ■ f(t)/F(t)  . (This  is  a widely 
used  concept  and  la  also  known  as  the  force  of  mortality,  the 
Mills  ratio,  ths  intensity  function,  and  the  hazard  rate.) 

5.  Xncreesing  failure  rete  (XFI). 

(e)  A continuous  feilure  distribution  is  IFR  if 

^ Ir(t)J  » 0 

(b) IFR Markov chains. We have previously used in Section 5.4 a
definition of an IFR Markov chain that is due to Wagner [23]. Assume a
chain with states 0, 1, 2, ..., N such that the greater the value of
the state the greater the deterioration of the item. Then the chain is
IFR if

$$\Pr\{s(n+1) \in B \mid s(n) = i, \mathcal{E}\}$$

is non-decreasing in i for all sets B of the form

$$B = \{k, k+1, k+2, \ldots, N\}$$

for any k = 0, 1, 2, ..., N. Equivalently, the chain is IFR if

$$\sum_{j=k}^{N} p_{ij}$$

is non-decreasing in i for all k, k = 0, 1, 2, ..., N. These
equivalent definitions connote that the greater the value of the
state, the greater the probability of further deterioration.
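
The tail-sum form of the IFR definition lends itself to a direct
numerical check. The following Python sketch tests whether a given
transition matrix is IFR in this sense. The function name
`is_ifr_chain`, the tolerance, and both example matrices are
illustrative assumptions and do not appear in the original study.

```python
from typing import List

Matrix = List[List[float]]


def is_ifr_chain(P: Matrix, tol: float = 1e-12) -> bool:
    """Return True if sum_{j>=k} P[i][j] is non-decreasing in i for every k."""
    n = len(P)
    for k in range(n):
        # Probability of moving to state k or worse, for each current state i.
        tails = [sum(row[k:]) for row in P]
        for i in range(1, n):
            if tails[i] < tails[i - 1] - tol:
                return False
    return True


# Hypothetical deterioration chain: worse states are more likely to worsen.
P_ifr = [
    [0.7, 0.2, 0.1],
    [0.0, 0.6, 0.4],
    [0.0, 0.0, 1.0],
]

# Hypothetical counterexample: state 1 tends to recover, so a tail sum drops.
P_not_ifr = [
    [0.2, 0.3, 0.5],
    [0.6, 0.3, 0.1],
    [0.0, 0.0, 1.0],
]

print(is_ifr_chain(P_ifr))      # True
print(is_ifr_chain(P_not_ifr))  # False
```

In the second matrix the probability of reaching state 1 or worse is 0.8
from state 0 but only 0.4 from state 1, which violates the definition
for k = 1.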

6.3  Results 

6.3.1  Control  Limit  Rules 
Theorem  6.1 

Let states {0, 1, 2, ..., N} be states of a Markov chain where the
higher numbered states represent progressively greater deterioration
of an item. Let C represent the cost if the item is replaced before it
becomes inoperative and $C + A$, $A \ge 0$, the cost for replacement
after the item becomes inoperative. Then the optimal policy is to
replace the item if and only if the item is in state
i, i+1, i+2, ..., N for some i. (This is a control limit rule where
state i represents the control limit.)

Proof 

Derman  [10]. 

The calculation of i is not a trivial procedure. Ross [6] presents a
linear programming algorithm to determine i. The thrust of the theorem
is sufficiently analogous to the concepts of this thesis to merit
investigation of the possibility of using a control limit rule to
determine optimal policies for information replenishment.
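
For a small chain the control limit can also be located by brute force.
The Python sketch below is not the linear programming algorithm of Ross
[6]; it simply evaluates the long-run average cost of every control
limit rule and keeps the cheapest. It rests on several assumptions that
are not in the original text: state N is the inoperative state, a
replacement occupies one period and returns the item to state 0, the
chain under each rule has a single recurrent class, and the matrix and
cost figures are hypothetical.

```python
from typing import Dict, List, Tuple

Matrix = List[List[float]]


def stationary(P: Matrix, iters: int = 5000) -> List[float]:
    """Stationary distribution by power iteration on the lazy chain (I + P)/2."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iters):
        nxt = [0.5 * pi[j] for j in range(n)]
        for i in range(n):
            for j in range(n):
                nxt[j] += 0.5 * pi[i] * P[i][j]
        pi = nxt
    return pi


def average_cost(P: Matrix, limit: int, C: float, A: float) -> float:
    """Long-run cost per period when the item is replaced in states >= limit."""
    n = len(P)                      # states 0..N with N = n - 1 inoperative
    reset = [1.0] + [0.0] * (n - 1)  # assumed: replacement returns to state 0
    Q = [reset if s >= limit else list(P[s]) for s in range(n)]
    pi = stationary(Q)
    cost = sum(pi[s] * C for s in range(limit, n - 1))  # planned replacements
    cost += pi[n - 1] * (C + A)                         # replacement after failure
    return cost


def best_control_limit(P: Matrix, C: float, A: float) -> Tuple[int, Dict[int, float]]:
    """Enumerate control limits 1..N and return the cheapest with all costs."""
    n = len(P)
    costs = {i: average_cost(P, i, C, A) for i in range(1, n)}
    return min(costs, key=costs.get), costs


# Hypothetical four-state IFR deterioration chain.
P_example = [
    [0.6, 0.3, 0.1, 0.0],
    [0.0, 0.5, 0.3, 0.2],
    [0.0, 0.0, 0.4, 0.6],
    [0.0, 0.0, 0.0, 1.0],
]

print(best_control_limit(P_example, C=1.0, A=5.0)[0])  # 1
print(best_control_limit(P_example, C=1.0, A=0.0)[0])  # 3
```

With a large failure penalty A the cheapest rule replaces early; with
A = 0 there is no reason to replace before failure, so the control limit
moves to the inoperative state itself.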
6.3.2 Bounds on the Optimum Replacement Period

A discussion by Barlow and Proschan [9] suggests a method of bounding
the optimal replacement period for the infinite horizon case.

We let

$$L(x) = \frac{R(x)}{S(x)} \tag{6.1}$$

where R(x) and S(x) are defined by (5.58).

For  some  x , say  x*  , to  be  optimal  is  equivalent  to 

L(x^)  a L(x*)  s X*  s Xj  (6.2) 

From  the  basic  definition  the  left-hand  inequality  is  equivalent  to 


Cl  G(Xj^)  + C2(Xj)  6(Xj^)  Cj^  G(x*)  + C2(x*)  G(x*) 


(6.3) 


I 


X* 

I 


k-0 

This  inequality  reduces  to 

X 


k-0 


M**)  G(x.)(C,(x*)  - C,(x.)].  _ C. 

; G(k)f—  2 ^ C - C(T^ 

“ W)  G(x*)[C  - C2(x*)]  J ^ 


or 


**^*1^  ^ - C2(x^) 


(6.4) 


(6.5) 


The  right-hand  inequality  in  (6.2)  reduces  to 

C, 


V * Cj  - C2(Xj) 


(6.6) 


We are considering only discrete values of x. However, the continuous
version of the right-hand side of (6.6) is shown in Fig. 6-1. In
addition, if the distribution G(x) is IFR as previously defined, then
H(x) is increasing in x. This limits the optimal region to
$x^* \le x_2$, where $C_1 - C_2(x_2) = 0$. This establishes an upper
bound for x*.

From (6.1) and (5.58) we may establish that

$$L(x) \to \frac{C_2}{\mu}$$

as $x \to \infty$, where $C_2$ is defined as

$$C_2 = \max_k \{c'(k) - C^*\}$$

from (5.52), and

$$\mu = \sum_k G(k)$$


We  rewrite  L(x)  as 


Cl  + [C^  - C2(x) ] G(x) 


(6.7) 


There is no assurance that L(x) ever crosses the line
$L(x) = C_2/\mu$, but if it does, it crosses to the right of the
intersection of that line and $C_1/x$. This intersection is at the
point where $x = (C_1/C_2)\mu$. Therefore, this is the lower bound
for x*.

6.4  Summary 

This chapter has been intentionally brief. The purpose has been simply
to suggest the possibilities of utilizing the existing definitions and
results of reliability theory to develop analogous results for
perishing information. The theory allowed the establishment of bounds
on the optimal acquisition period and appeared to hold some promise for
establishing a control limit rule. The theory of reliability is
extensive, and the results here are only a fairly cursory survey of
possibly applicable approaches.
CHAPTER 7

CONCLUSIONS AND EXTENSIONS

This is the point for reflection and projection. Where are we now,
and where do we have to go?

The  goals  of  the  first  half  of  the  study  were  straightforward: 

1. To describe the phenomenon of information perishing.

2. To develop operational definitions.

3. To prove the inevitability of information perishing.

4.  To  determine  several  parameters  to  describe  the  process. 

5.  To  consider  ancillary  areas  such  as  the  effect  of  risk 
aversion,  discounting  of  rewards,  and  contingency  decision 
making. 

These  goals  have  been  met. 

The  goals  of  the  second  half  of  the  thesis  were: 

1.  To  describe  information  replenishment. 

2.  To  develop  optimal  acquisition  policies  for  singly  occurring 
decisions. 

3. To develop optimal acquisition policies for multiple occurring
decisions.

4. To suggest parallels from reliability theory.

These goals have also been met with the reservation that bounds were
established for the optimal acquisition policies rather than precise
determination of the period between replenishments.

The accomplishment of these goals contributes to the rationalization
of the information process. However, the study suggests several needed
extensions. These can be grouped into two main divisions: theoretical
and applied.

The  theoretical  extensions  are: 

1. The development, in the main, centered on Markovian information
models. A few results hold for any probability distribution. The
generality of many results is limited by the Markov assumption.
Consideration of other dynamic probability models would be worthwhile.

2. The bounds on the optimal period between information acquisitions
are useful. However, an algorithm to determine an exact value of the
optimal period would enhance these results.

3. The foray into the thicket of reliability theory was limited. Much
of the theory, as previously noted, concerns parameter estimation and
is of little apparent use. However, the results concerning inspection
and equipment replacement may have significant application to
information economics. This is perhaps the extension of most immediate
potential.

The applicatory extensions fall into two sub-groups:

1. Descriptive.

Several questions would merit study at the personal and organizational
level. Do people and organizations recognize the phenomenon of
information perishing? If so, how do they cope with this
deterioration? Is there a rationale for allocation of resources for
information acquisition? Is it related to information perishing?

2. Normative.

A body of theory should lead to a set of optimal policies to guide
individual and organizational decision makers in the allocation of
resources for information collection, analysis, and use. This thesis,
along with the other Decision Analysis research efforts cited in
Chapter 2, forms a basis for development of such a normative theory.

The  enormity  and  complexity  of  completing  such  a theory  is  apparent. 
However,  the  national  intelligence  budget  today  is  in  the  billions,  and 
further  billions  are  spent  in  information  acquisition  at  corporate  and 
individual  levels.  The  saving  of  even  a small  percentage  of  this  huge 
sum  would  merit  a major  and  dedicated  research  effort. 
APPENDIX A

PROOF OF THEOREMS 2.3 AND 2.4


A.1 Purpose

The purpose of this appendix is to prove Theorems 2.3 and 2.4,
important proofs but of such length that the reader would be
distracted from the logic of Chapter 2.

A.2 Proof of Theorem 2.3

Theorem 2.3 consists of showing that $p(n) \le |\lambda_1|$ for the
constant reward, invariant transition matrix case.

Proof 

The  proof  is  by  induction  on  n . 


tor  M„>  ^ 0 
0 for  A(n)  • 0 


(A.l) 


where 


A(n)  ■ v*(n)  - v*(n) 


(A.  2) 


If p(n) = 0, i.e., A(n-1) or A(n) = 0, then $p(n) \le |\lambda_1|$ is
true trivially. Therefore, we shall assume this is not true. For the
first step of the induction we prove $p(1) \le |\lambda_1|$. By
definition,


(A.  3) 


Let M-1 be the transition (counting forward) with 1 to go and M be the
transition with 0 to go. From (2.22) we may deduce that the expected
reward at transition M conditioned on some starting state, say state
i, is

<v(H)|*(0)-1.6(M)-6A.O  - -ax/ •••q^IN'^ 

i - - - - — — ^ 


\ .i’ll  I’l 


...  + 


4-1  .in’ll  M-l’U  ***  N-l’lH,} 


(A.  5) 




«h«r«  . oq,, ««  .leaent.  of  [Q  ] wd 


O'*!!  * 0'‘!2 0'‘!N  -o‘ 

[R]  “ [r ^ ^3  » R ■ •••»  R"^ 

Therefore, for A(0) we write

A(0)  - ^ 1 I I 


i-! 


Similarly,


J-0  j^! 

H-! 

I 

J-0 


1-! 


M 


N 


A(!)  - ^ n^{B«  1 \ I 

i-!  * J-0  j^!  i-! 


(0) 


(A.7) 


•!  H 


+ 1 ’^h*  I *5 1 I v*>'i 


(0) 


t-!  ^ J-0  i^!  i-! 

From (A.1), (A.3), (A.6), and (A.7) we see that

$$p(1) = \frac{A(0)}{A(0) + \alpha A(0)} \tag{A.8}$$

where $\alpha \ge 1$ by Theorem 2.2. Therefore,

$$p(1) = \frac{1}{1 + \alpha} \le \frac{1}{2} \tag{A.9}$$

So if $|\lambda_1| \ge 1/2$ the theorem holds.

Assume now that $0 < |\lambda_1| < 1/2$. Then one must show that


p(l) 


“ 1—1 t 


(0) 


j'l/?’}-  Ivi”*  W*;"  M I j’u't"'}-  Ivl” 

1 J 1 I I * J 1 t 


« IXil 

We may prove inequality (A.10) is valid by the method of
contradiction. Assume (A.10) is false and that
$p(1) \ge |\lambda_1|$. This implies


1 * J 1 1 

• i'i'Mt  j’u'r*  T w i 

I 

■ 2 Ivl®’}  » “ 


“ J 1 J i 


Ut 


max 

k 


I >■"  I j’u'i'"’  • I I j’u'l’ 


J i 


J I 


and 


J i J i 

The two "constant" terms in (A.11) are negative, i.e.,

‘ IVi^°^  ^^\h\  Lvi^°^  ^ ° 


(A.  11) 


(A.  12) 


(A.  13) 


(A.  14) 


as $|\lambda_1| \le 1/2$ by assumption. Therefore, for (A.10) to be
positive, as assumed,

i i I i I 3 1 

(A.  15) 

must be greater than zero. Let expression (A.15) $= \Gamma$, and by
assumption $\Gamma \ge 0$. As


1 J i 1 J 1 


(A.  16) 


by  the  optimality  of  the  decisions,  then 


0 

1 i 4 


0 s r S ^ -JXj  - IXjl-  IXjlXj]  Xj  } ^ } 

i J I 

and 

0 s r " l^il"I^Uj]  L j’*u'k  } 

i j t 

If the bracketed expression containing the "$|\lambda_1|$" terms in
(A.19) is positive, the right side of the inequality is positive;
conversely, if this term is negative, the right side is negative. If
$\lambda_j \ge 0$, then

$$[\lambda_j - |\lambda_1| - |\lambda_1|\lambda_j] \le 0, \quad \text{as } |\lambda_1| \ge \lambda_j \ \forall j \tag{A.20}$$

Therefore, for $\lambda_j \ge 0$, the assumption that
$\Gamma \ge 0$ is false. Similarly, if $\lambda_j \le 0$, then

$$-|\lambda_j| + |\lambda_j||\lambda_1| - |\lambda_1| \le 0, \quad \text{as } |\lambda_j| \le |\lambda_1| \le \tfrac{1}{2} \tag{A.21}$$

by assumption. So once again the assumption that $\Gamma$ is positive
is false. Therefore, we conclude


(A.  is: 


(A.  19] 


$$\Gamma \le 0 \tag{A.22}$$

which is contrary to inequality (A.17). It follows that the
assumption that $p(1) \ge |\lambda_1|$ is false, and

$$p(1) < |\lambda_1| \tag{A.23}$$

We continue the induction by assuming

* IhI  “ 

and  prove 

‘ M <*•« 


We let the

n transitions to go = L - 1 transition counting forward
n-1 transitions to go = L transition counting forward
n-2 transitions to go = L + 1 transition counting forward


Let r. and r denote optimal decisions at L - 1 and L ,
« X 

respectively.  Then  we  must  show  that 


I"!  )})  1 j’lt't**  ■ 

^ < iXjl 


Lin) 


i j 1 i 


(A.  26) 


Assume  the  contrary  of  the  proof,  or  that 

Therefore,  cross-multiplying  and  transposing  in  (A. 26)  yields 

1 J it  it 


i i 

+ Lin-2)  - |Xj|  Lin-1)  2 0 


(A.  27b) 


Now either (A.27a) or (A.27b) or both must be positive to satisfy the
inequality. As $|\lambda_1| < 1$, the second bracketed expression of
(A.27a) is less than zero. Therefore, for (A.27a) to be greater than
zero, the first bracketed term must be positive. However, by reasoning
similar to that used in the proof of (A.16) through (A.19) and noting
the optimality of the decisions we see that


L”l[  ^ h J**!!*!  * ^ ^ W 1 

1 j 1 j i 1 j i it 


(A.  28) 


After  rewriting  the  left  side  of  (A.26)  one  may  show 


® * • iMi'  I i’k'D  ‘***> 


as

$$[\lambda_j - |\lambda_1|] \le 0 \tag{A.30}$$

for all $\lambda_j$. Thus (A.27a) is negative and (A.27b) must be
positive for our assumption to hold. For

$$A(n-2) - |\lambda_1|\, A(n-1) \ge 0 \tag{A.31}$$

then 

A(n-1)  * \\\  U.32) 

which violates the induction hypothesis. We have now shown that both
(A.27a) and (A.27b) are negative, contrary to our assumptions.

We are in a position to state an important result.

Theorem 2.3

For the n-state Markov process with k reward decisions and an
invariant transition matrix, $p(n) \le |\lambda_1|$, the absolute
value of the largest transient eigenvalue.
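
The role of the largest transient eigenvalue can be illustrated on the
simplest case. The Python sketch below is an illustration only and does
not compute the quantity p(n) of the theorem: for a two-state chain it
shows that knowledge of the starting state decays toward the stationary
distribution by a factor of exactly $|\lambda_1|$ per transition. The
transition probabilities `a` and `b` and all function names are
hypothetical; for a two-state chain the eigenvalues are 1 and 1 - a - b.

```python
from typing import List


def step(dist: List[float], P: List[List[float]]) -> List[float]:
    """Advance a state probability vector one transition: dist * P."""
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]


# Hypothetical two-state chain with transition probabilities a (0 -> 1), b (1 -> 0).
a, b = 0.3, 0.1
P_two = [[1.0 - a, a], [b, 1.0 - b]]

lam1 = 1.0 - a - b                        # the single transient eigenvalue
stationary_dist = [b / (a + b), a / (a + b)]

dist = [1.0, 0.0]                         # state known with certainty at n = 0
prev_dev = abs(dist[0] - stationary_dist[0])
ratios = []
for n in range(1, 6):
    dist = step(dist, P_two)
    dev = abs(dist[0] - stationary_dist[0])
    ratios.append(dev / prev_dev)         # per-transition decay factor
    prev_dev = dev

print(abs(lam1))
print(ratios)
```

Each entry of `ratios` agrees with `abs(lam1)` up to rounding, which is
the geometric decay that the eigenvalue bound captures.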

A.3 Proof of Theorem 2.4

We had limited the previous proof to decision situations where the
decision was limited to a choice of state, and the transition matrix
was invariant. However, we may also extend the result to the situation
where the decision maker may elect not only the reward structure but
also the transition matrix. The notation is the same as for the
previous theorem. Our method of proof is to develop an expression that
is equivalent to (A.4). From this it follows that the remaining proof
is identical.

The theorem is

Theorem 2.4

For an n-state Markov process let $k \in K$ represent an index set of
reward decisions and $l \in L$ represent an index set of transition
matrix decisions. Then

$$p(n) \le |\lambda_1|'$$

where $|\lambda_1|'$ is the maximum over $l \in L$ of the greatest
absolute value of the transient eigenvalues of the corresponding
transition matrix.
Proof

We assume the existence of some stationary optimal policy based on
the no-information case with an associated ^ n ^ and reward ^ .

We also assume in the proof of

< l^ll' 

that we may determine, based on an optimal policy from m = 0 to
m = M-2, the value of ^ , where i reflects the initial starting state.
We further assume the decision maker elects matrix [P'] at transition
M-2 and [P"] at transition M-1 so that


1 1 "l  , [P’J 

(A.  34) 

(1 

and 

1 ^1^**^  1 “l  1 

(A.  35) 

TTi(M-2)  , [P'J  [P-J 

(A.  36) 

0 

where [P'] and [P"] may or may not be the same transition matrices.
We further assume [P'] has the largest (in absolute value) eigenvalue.

We now use N differential matrices to express P' so that (A.36)
becomes

0 

or 

(A.  37) 

(A.  38) 

Expression (A.36) allows us to write the expected reward at
transition M conditioned on s(0) = i as

<r(ll)|e(0)*i,6(IO"fi^»*>  • ■!«  , 1^00  , [Al 

U.39) 

0 

wtore  [E]  ■ fteelly 
A notion, widely held by decision analysts but tenuously defined, is that
the value of any specific information diminishes over time. This concept,
termed information perishing, is rigorously defined and illustrated by
the use of a Markov model in the first section of the study.

The  main  assertions  of  the  section  are: 


1.  Information  perishing  is  inevitable  (not  only  for  the 
Markov  model  of  information  but  for  any  state  of 
information  described  by  a probability  distribution). 

2.  For  the  Markov  model,  the  absolute  value  of  the  largest 
transient  eigenvalue  is  an  upper  bound  for  the  rate  of 
information  perishing. 

3.  The  rate  of  perishing  is  a decreasing  function  of  time. 


A short transition section alters the basic decision model to allow an
element of uncertainty for the exact timing of the decision. Basically,
the new model of the decision process recognizes that many decisions in
real life are "triggered" by events which may be described by some
stochastic process. Without this uncertainty, the decision-maker could
simply discount the value of information because of perishing and would
reduce his problem to a static case. However, the uncertainty in timing
forces consideration of optimal policies of information replenishment,
the second main area of the thesis.

The  major  results  of  this  section  are: 




1.  Rules  of  optimality  are  developed  for  singly  and  multiple 
occurring  decisions. 

2.  The  optimality  of  periodic  replenishment  (under  certain 
limiting  conditions)  is  established. 

3.  The  suggestion  that  some  of  the  research  results  of 
reliability  and  maintainability  theory  may  be  applied 
to  information  replenishment  strategy. 

The  thesis  closes  with  the  customary  delineation  of  areas  of  further 
application  and  research. 